arXivDaily arXiv每日学术速递 周一至周五更新
重置
2606.12785 2026-06-12 cs.GT 新提交

The No-show Paradox in Single Transferable Vote under One-dimensional Preferences

一维偏好下单一可转移投票中的缺席悖论

Farhad Mohsin

AI总结 研究一维偏好模型下单一可转移投票(STV)的群体缺席悖论,发现极端选民弃权易引发悖论,且随候选人数增加概率显著上升。

详情
AI中文摘要

群体缺席悖论(GNSP)是指一组选民弃权后,新获胜者更受他们偏好。先前研究表明,即使对于易受此悖论影响的投票规则,在实际选举和多种假设下,该悖论也罕见发生。然而,我们发现,在一维偏好模型(如1D-Euclidean、单峰或单交叉偏好)下,流行的 runoff 规则——单一可转移投票(STV)——极易受到 GNSP 的影响。这与另一类易受 GNSP 影响的规则——Condorcet 规则——形成鲜明对比,后者在这些一维偏好下不会出现悖论。我们从理论上识别了 STV 在一维偏好模型下发生 GNSP 的易于处理且普遍存在的充分条件。通过理论结果和来自这些领域的合成偏好配置实验,我们证明一维频谱两端的选民特别容易因弃权而引发 GNSP。此外,随着备选方案数量的增加,发生的可能性显著增加。

英文摘要

The group no-show paradox (GNSP) occurs when a group of agents abstaining from voting can make the new winner more preferred to them. Previous work has suggested that even for voting rules susceptible to this paradox, it is a rare occurrence in real elections and under various assumptions. However, we find that under one-dimensional preference models such as 1D-Euclidean, single-peaked, or single-crossing preferences, Single Transferable Vote (STV), a popular runoff rule, is highly vulnerable to GNSP. This is in stark contrast to Condorcet rules, another family of rules susceptible to GNSP, where the paradox cannot occur under these one-dimensional preferences. We theoretically identify tractable and prevalent sufficient conditions for GNSP to occur for STV under one-dimensional preference models. Through our theoretical results and experiments with synthetic preference profiles from these domains, we demonstrate that voters at the extremes of the 1D spectrum are particularly likely to cause GNSP by abstaining. Furthermore, the likelihood of occurrence increases substantially as the number of alternatives grows.

2606.12783 2026-06-12 cs.AI 新提交

A Tutorial on World Models and Physical AI

世界模型与物理AI教程

Il-Seok Oh

发表机构 * Department of Computer Science and Artificial Intelligence/CAIIT, Jeonju, Jeonbuk, South Korea(韩国全北全州计算机科学与人工智能系/CAIIT)

AI总结 本文提出统一框架,区分显式与隐式世界模型,并探讨其在机器人、自动驾驶等物理AI领域的应用,以及迈向通用人工智能的挑战。

详情
AI中文摘要

世界建模正成为构建具备预测、推理和决策能力的智能系统的核心原则。显式世界模型与隐式世界模型之间存在一个核心区别:前者学习结构化动态以进行基于推演的推理和规划,后者则将预测结构编码到可扩展的学习表示中。这些互补范式为机器人、自动驾驶等领域的物理AI奠定了基础,使其能够在现实世界约束下实现超越反应式控制的智能。近期的基础模型进一步指明了通向集成感知、预测和行动的通用系统的路径。尽管进展迅速,但在层次推理、长时域规划和自主目标形成方面仍存在重大挑战,这些对于迈向通用人工智能至关重要。本教程提出了一个连贯的框架,其中多种世界建模方法通过共享的预测结构得以统一,并通过这种结构的表示和利用方式加以区分。

英文摘要

World modeling is emerging as a central principle for building intelligent systems capable of prediction, reasoning, and decision making. A central distinction can be drawn between explicit world models, which learn structured dynamics for rollout-based reasoning and planning, and implicit world models, which encode predictive structure within scalable learned representations. These complementary paradigms provide a foundation for physical AI in domains such as robotics and autonomous driving, enabling intelligence beyond reactive control under real-world constraints. Recent foundation models further suggest a pathway toward unified systems integrating perception, prediction, and action. Despite rapid progress, major challenges remain in hierarchical reasoning, long-horizon planning, and autonomous goal formation, which are critical for advancing toward artificial general intelligence. This tutorial presents a coherent framework in which diverse world modeling approaches are unified through shared predictive structure and differentiated by how such structure is represented and exploited.

2606.12780 2026-06-12 cs.LG cs.CL 新提交

ProPlay: Procedural World Models for Self-Evolving LLM Agents

ProPlay: 用于自我进化LLM智能体的程序化世界模型

Yijun Ma, Zehong Wang, Yiyang Li, Ziming Li, Xiaoguang Guo, Weixiang Sun, Chuxu Zhang, Yanfang Ye

发表机构 * University of Notre Dame(圣母大学) University of Connecticut(康涅狄格大学)

AI总结 提出ProPlay程序化世界模型,通过程序级预演和因果过程图,使LLM智能体在部分可观测环境中自我进化,无需外部监督。

详情
AI中文摘要

自我进化智能体应能在无外部监督下通过交互改进,但在部分可观测环境中仍困难,智能体必须主动探索、从有限反馈中学习,并决定何时信任先前经验。现有的LLM智能体方法通常依赖记忆或规划模块,但很少在它们之间闭环以持续完善对环境动态的内部理解。我们提出ProPlay,一种程序化世界模型,支持程序级预演,智能体可利用学到的世界知识排练未来的程序路径。ProPlay不将经验表示为孤立的规则或低层动作约束,而是将成功轨迹抽象为程序,并在捕获任务阶段间因果转换的程序图中组织它们。每个转换与一个可靠性记录嵌入相关联,以从过去结果中估计其任务特定贡献。在每个回合前,ProPlay在已知图结构上模拟未来程序轨迹作为结构化软指导;执行后,它利用环境反馈精炼图。在公开基准上的实验表明,ProPlay在环境理解和自我进化能力上持续优于强基线。我们的代码已在此https URL发布。

英文摘要

Self-evolving agents are expected to improve through interaction without external supervision, but this remains difficult in partially observable environments where agents must explore actively, learn from limited feedback, and decide when to trust prior experience. Existing LLM-agent methods often rely on memory or planning modules, yet they rarely close the loop between them to continually refine an internal understanding of environment dynamics. We introduce ProPlay, a procedural world model that supports procedure-level preplay, where agents can rehearse future procedural paths using the learned world knowledge. Rather than representing experience as isolated rules or low-level action constraints, ProPlay abstracts successful trajectories into procedures and organizes them in a procedure graph that captures causal transitions among task stages. Each transition is associated with a reliability record embedding to estimate its task-specific contribution from past outcomes. Before each episode, ProPlay simulates future procedural trajectories over known graph structures as structured soft guidance; after execution, it refines the graph using environment feedback. Experiments on public benchmarks show that ProPlay consistently improves environment understanding and self-evolution capability over strong baselines. Our code has been released in this https URL.

2606.12774 2026-06-12 eess.SY cs.AI cs.CL 新提交

Agentic MPC for Semantic Control System Resynthesis

用于语义控制系统再综合的智能体MPC

Yuya Miyaoka, Masaki Inoue

AI总结 提出智能体MPC框架,通过集成大语言模型智能体实现上下文感知的语义自适应控制综合,在自动驾驶场景中验证其根据个人偏好或社交情境(如避让应急车辆)调整控制的能力。

详情
Comments
7 pages, 5 figures
AI中文摘要

虽然MPC有效处理结构化、多样化和低层级的规范,但它缺乏动态融入高层级上下文信息(如社会规范、用户意图或自然语言指令)的能力。为解决这一局限,本文引入了一种智能体MPC框架,通过集成基于大语言模型的智能体,实现上下文感知、语义自适应的控制综合。该智能体解释异构输入,包括自然语言消息、环境观测和外部知识,以重新综合控制规范。该框架的有效性在自动驾驶场景中得到验证,系统能够根据个人偏好或对社交情境(如应急车辆避让)做出响应。

英文摘要

While MPC effectively handles structured, diverse, and low-level specifications, it lacks the capability to dynamically incorporate high-level contextual information such as social norms, user intent, or natural language instructions. To address this limitation, this manuscript introduces an agentic MPC framework that enables context-aware, semantically adaptive control synthesis by integrating with large language model-based agents. The agent interprets heterogeneous inputs, including natural language messages, environmental observations, and external knowledge, to resynthesize the control specifications. The effectiveness of the framework is demonstrated in an autonomous driving scenario, where the system aligns with personal preferences or responds to social situations such as emergency vehicle yielding.

2606.12767 2026-06-12 cs.AI 新提交

Constructing Evaluation Datasets for Procedural Reasoning: Balancing Naturalness, Grounding, and Multi-Hop Coverage

构建程序性推理评估数据集:平衡自然性、基础性和多跳覆盖

Sarah Elshabrawy, Rahul K. Dass, Ashok K. Goel

发表机构 * Georgia Institute of Technology(佐治亚理工学院)

AI总结 研究基于任务-方法-知识(TMK)模型的问题生成策略对程序性和多跳推理数据集质量的影响,提出基础性验证框架,发现严格TMK生成策略在基础性和可用性上最优。

详情
Comments
10 pages, 2 numbered figures. Workshop submission to HAIL @ AIED 2026
AI中文摘要

评估AI辅助学习系统中的程序性推理需要问答数据集,这些数据集既要像学习者一样,又要基于系统预期使用的教学知识。我们研究了基于TMK的问题生成策略如何影响程序性和多跳推理的数据集质量。我们比较了三种策略:从任务-方法-知识(TMK)模型严格生成、先转录后基于TMK过滤的生成、以及结合转录和结构化指导的TMK感知生成。为了评估生成的项目,我们引入了一个基于从TMK模型中提取的闭集证据单元的基础性验证框架。该框架衡量答案是否由底层表示支持、问题是否自包含、以及是否针对多跳程序性推理。在23个教学主题和690个生成的问答对中,严格TMK生成实现了最强的整体质量,其中96.5%的问题有基础,92.6%的问题可用。先转录生成产生更像学习者的问题,但更多是上下文依赖或基础薄弱的问题,而TMK感知生成产生较高的原始多跳覆盖率但基础性较低。这些结果表明,程序丰富性和自然措辞并不能保证表示基础性,这促使在AI辅助学习中的评估数据集需要进行显式的表示感知验证。

英文摘要

Evaluating procedural reasoning in AI-supported learning systems requires question-answer datasets that are both learner-like and grounded in the instructional knowledge the system is expected to use. We study how TMK-based question generation strategies affect dataset quality for procedural and multi-hop reasoning. We compare three strategies: strict generation from Task-Method-Knowledge (TMK) models, transcript-first generation with post-hoc TMK filtering, and TMK-aware generation that combines transcripts with structured guidance. To evaluate generated items, we introduce a grounding validation framework based on closed-set evidence units extracted from TMK models. The framework measures whether answers are supported by the underlying representation, whether questions are self-contained, and whether they target multi-hop procedural reasoning. Across 23 instructional topics and 690 generated question-answer pairs, strict TMK generation achieves the strongest overall quality, with 96.5% grounded questions and 92.6% usable questions. Transcript-first generation produces more learner-like questions but more context-dependent or weakly grounded items, while TMK-aware generation yields high raw multi-hop coverage but lower grounding. These results show that procedural richness and natural phrasing do not guarantee representational grounding, motivating explicit representation-aware validation for evaluation datasets in AI-supported learning.

2606.12765 2026-06-12 cs.CL cs.DC 新提交

Rigel: Reverse-Engineering the Metal 4.1 Tensor Compute Path on the Apple M4 Max GPU

Rigel:逆向工程 Apple M4 Max GPU 上的 Metal 4.1 张量计算路径

Ramchand Kumaresan

AI总结 通过微基准测试逆向工程 Apple M4 Max 的 Metal 4.1 张量计算路径,揭示 fp8 matmul2d 为模拟而非硬件加速,并重建了 8x8 张量片段布局。

详情
AI中文摘要

Apple 的 Metal 4.1 暴露了一条张量计算路径:基于 cooperative_tensor 片段的 Metal Performance Primitives (MPP) matmul2d 操作,其接口有文档记录,但硬件行为被故意隐藏。规范说明了支持哪些数据类型行,但从未说明它们是否经过硬件加速、操作在物理上何处执行、其累加器宽度是多少,或者如何在线程间划分矩阵片段。我们提出了 Rigel,这是对单个 Apple M4 Max(前神经加速器一代)上该路径的经验性表征。使用校验和门控、来源追踪的微基准测试工具,Rigel 恢复了 v4.1 规范隐藏或矛盾的十一个事实。主要发现:Metal 4.1 fp8 (E4M3) matmul2d 是模拟的,而非加速的:尽管读取的操作数字节数减半,但其吞吐量仅为 fp16 的 0.94 倍,因此在 M4 上它是一个内存占用特性,而非性能特性。我们进一步通过三信号三角测量(吞吐量上限、与 simdgroup_matrix 的比较以及每路功率归因)表明,matmul2d 完全在 GPU 着色器核心上执行,没有专用的矩阵数据路径,也没有证据表明路由到 Apple 神经引擎;它使用 >=fp32 累加;并且我们重建了 Apple 在任何地方都没有记录的 opaque 8x8 cooperative_tensor 片段布局。基于该表征,一个手动融合的 GEMM + bias + GELU 内核在缓存驻留状态下比分解路径快 6.5-12.9%。所有发现均可从 MIT 许可的代码和逐单元 CSV 中重现。

英文摘要

Apple's Metal 4.1 exposes a tensor compute path: the Metal Performance Primitives (MPP) matmul2d operation over cooperative_tensor fragments, whose interface is documented but whose hardware behavior is deliberately hidden. The specification states which data-type rows are supported, never whether they are hardware-accelerated, where the operation physically executes, what its accumulator width is, or how it partitions matrix fragments across threads. We present Rigel, an empirical characterization of this path on a single Apple M4 Max (a pre-neural-accelerator generation). Using a checksum-gated, provenance-tracked microbenchmark harness, Rigel recovers eleven facts the v4.1 specification hides or contradicts. The headline finding: the Metal 4.1 fp8 (E4M3) matmul2d is emulated, not accelerated: it sustains 0.94x the throughput of fp16 despite reading half the operand bytes, so on M4 it is a memory-footprint feature, not a performance feature. We further show, via a three-signal triangulation (throughput ceiling, comparison against simdgroup_matrix, and per-rail power attribution), that matmul2d executes entirely on the GPU shader cores with no dedicated matrix datapath and no evidence of Apple Neural Engine routing; that it accumulates in >=fp32; and we reconstruct the opaque 8x8 cooperative_tensor fragment layout Apple documents nowhere. Acting on the characterization, a hand-fused GEMM + bias + GELU kernel beats the decomposed path by +6.5-12.9% in the cache-resident regime. All findings are reproducible from committed MIT-licensed code and per-cell CSVs.

2606.12764 2026-06-12 cs.LG cs.CL cs.CR 新提交

Detecting Functional Memorization in Code Language Models

检测代码语言模型中的功能记忆

Matthieu Meeus, Anil Ramakrishna, Matthew Grange, Zheng Xu, Luca Melis

发表机构 * Meta Imperial College London(伦敦帝国学院)

AI总结 研究代码语言模型的功能记忆现象,通过反事实设置对比暴露目标代码的模型与未暴露的参考模型,使用文本和功能相似性度量,发现功能记忆超出文本重叠的检测范围。

详情
AI中文摘要

大型语言模型(LLMs)越来越多地被用于大规模生成代码。同时,先前的工作通过审计训练示例与模型生成之间的文本重叠,研究了训练数据是否可以从模型输出中恢复。然而,代码可能在功能上等价而在文本上不相似。在这项工作中,我们研究了功能记忆:提取超出逐字指标检测的功能逻辑。我们为Olmo-3-32B构建了一个反事实设置,将中期训练模型(暴露于目标代码)与预训练参考模型(未暴露)进行比较。我们使用Python函数签名提示两个模型,并测量文本和功能相似性(即LLM作为评判者、基于执行)。我们的结果显示了功能记忆的明确证据,突出了需要超越文本重叠的审计指标。

英文摘要

Large language models (LLMs) are increasingly used to generate code at scale. Meanwhile, prior work has investigated whether training data may be recoverable from model outputs, by auditing the textual overlap between training examples and model generations. Code, however, can be functionally equivalent while textually dissimilar. In this work, we study functional memorization: extraction of functional logic beyond what verbatim metrics detect. We construct a counterfactual setup for Olmo-3-32B, comparing a midtrained model (exposed to target code) against a pretrained reference (not exposed). We prompt both models with Python function signatures and measure both textual and functional similarity (i.e., LLM-as-a-judge, execution-based). Our results show clear evidence of functional memorization, highlighting the need for auditing metrics that go beyond textual overlap.

2606.12763 2026-06-12 cs.LG cs.DS 新提交

Adaptive Weighted Averaging

自适应加权平均

Aditya Bhaskara, Ashok Cutkosky, Ravi Kumar, Manish Purohit

发表机构 * University of Utah(犹他大学) Boston University(波士顿大学) Google(谷歌)

AI总结 提出一种从单次无偏估计中选取最大未知值的方法,具有可容许性且不劣于基线,应用于随机优化获得在线到批次的转换界限。

详情
AI中文摘要

我们研究在仅对每个 $x_i$ 有一个无偏估计 $y_i$ 的情况下,从 $n$ 个未知值 $x_1,\dots,x_n$ 中选择最大值的问题。我们设计的策略同时具有可容许性(不被任何其他策略一致支配)且不劣于给定的基线(如均匀随机选择)。我们将其应用于随机优化,获得了具有理想“无妥协”保证的在线到批次转换界限:它们从不比标准随机迭代选择差,同时在良性设置中可以显著更好。

英文摘要

We study the problem of selecting the largest among $n$ unknown values $x_1,\dots,x_n$ given only a single unbiased estimate $y_i$ for each $x_i$. We design strategies that are simultaneously admissible (not uniformly dominated by any other strategy) and also never worse than a given baseline such as uniform random selection. We provide an application to stochastic optimization, where we obtain online-to-batch conversion bounds with a desirable "no-compromise" guarantee: they are never worse than standard random iterate selection, and yet can be significantly better in benign settings.

2606.12759 2026-06-12 cs.RO 新提交

Sparse2Act: Learning Action-Aligned Sparse 3D Representations for Cross-Domain Robot Manipulation

Sparse2Act: 学习跨域机器人操作的动作对齐稀疏3D表示

Yu Guo, Chang Yu, Siyu Ma, Yunuo Chen, Yin Yang, Ying Nian Wu, Chenfanfu Jiang

发表机构 * University of California, Los Angeles(加州大学洛杉矶分校) University of California, San Diego(加州大学圣迭戈分校) University of Utah(犹他大学)

AI总结 提出Sparse2Act框架,通过动作对齐的掩码稀疏3D编码预训练,实现跨域机器人操作,在LIBERO-10上达86.9%成功率,并支持域迁移和sim-to-real。

详情
AI中文摘要

显式3D表示对于操作任务具有吸引力,因为它们以度量坐标暴露物体形状、工作空间几何以及机器人-物体关系。然而,稀疏3D编码器通常通过下游任务目标学习,将表示与特定数据分布、策略架构和动作参数化绑定。我们引入Sparse2Act,一个用于预训练稀疏点云编码器的观察-动作对齐框架。关键思想是使用任务空间末端执行器动作作为几何监督:训练掩码稀疏3D令牌以组织场景特征,使其围绕与观察配对的工作空间运动。预训练后,仅编码器初始化被下游策略重用,允许它们保留自己的架构和动作空间,包括关节空间命令。在LIBERO-10基准上,我们的方法在500步微调后达到86.9%的平均成功率。相同的预训练编码器支持LIBERO到Meta-World的跨域迁移,在Meta-World-5基准上达到73.4%的平均成功率。关于目标和解码器容量的消融实验表明,增益来自掩码动作对齐信号,并且在下游动作解码器中仍然有用。在真实世界实验中,模拟预训练后跟有限真实数据微调,在四个任务上平均成功率达到72.5%,展示了有效的模拟到真实迁移。这些结果表明,机器人动作可以为可重用的稀疏3D表示提供紧凑的几何监督。

英文摘要

Explicit 3D representations are attractive for manipulation because they expose object shape, workspace geometry, and robot-object relations in metric coordinates. However, sparse 3D encoders are often learned through downstream task objectives, tying the representation to a particular data distribution, policy architecture, and action parameterization. We introduce Sparse2Act, an observation-action alignment framework for pretraining sparse point-cloud encoders. The key idea is to use task-space end-effector actions as geometric supervision: masked sparse 3D tokens are trained to organize scene features around the workspace motion paired with the observation. After pretraining, only the encoder initialization is reused by downstream policies, allowing them to retain their own architectures and action spaces, including joint-space commands. On the LIBERO-10 benchmark, our method achieves 86.9% average success after 500 fine-tuning steps. The same pretrained encoder supports LIBERO-to-Meta-World cross-domain transfer, achieving 73.4% average success on the Meta-World-5 benchmark. Ablations on the objective and decoder capacity show that the gains come from the masked action-alignment signal and remain useful across downstream action decoders. In real-world experiments, simulation pretraining followed by limited real-data fine-tuning achieves an average success rate of 72.5% across four tasks, demonstrating effective sim-to-real transfer. These results suggest that robot actions can provide compact geometric supervision for reusable sparse 3D representations.

2606.12754 2026-06-12 cs.CL cs.AI 新提交

LLMs Can Better Capture Human Judgments--With the Right Prompts

LLMs 能更好地捕捉人类判断——使用合适的提示

Danica Dillion, Chen Cecilia Liu, Baihui Wang, Daniele Barolo, Tanmay Rajore, Niket Tandon, Pranathi Ravikumar, Kurt Gray

AI总结 通过简单提示策略,LLMs 能恢复人类反应的完整分布,并减少对措辞变化的敏感性,提升 AI-人类对齐。

详情
AI中文摘要

大型语言模型(LLMs)在捕捉人类判断方面是否表现不佳?两个常被提及的限制是:LLMs 无法捕捉反应的全分布,以及它们的判断在措辞变化上不稳定。我们展示了缓解这些限制的简单提示策略。在两个数据集上——一个代表美国的 144 个道德情景集,以及国际社会调查项目“家庭与性别角色变化”模块涵盖 32 个国家的 38 个道德信念——我们展示了简单的启发式技术如何帮助改善 AI-人类对齐。首先,提示模型报告标准差和反应比例,比常见策略更好地恢复了人类反应的完整范围。其次,确保情景对人类参与者清晰——如人类困惑评分所反映——提升了模型对齐度,且 LLMs 可以跟踪人类困惑评分。同时,我们发现 LLMs 对自身误差的估计校准不佳,尽管它们能相对较好地预测人类变异性。这些结果表明,向 LLMs 提出更好的问题可以得到更好的答案。

英文摘要

Are large language models (LLMs) bad at capturing human judgment? Two commonly stated limitations are that LLMs fail to capture full distributions of responses, and that their judgments are unstable across wording variations. We demonstrate simple prompting strategies that mitigate these limitations. Across two datasets--a U.S.-representative set of 144 moral scenarios and 38 moral beliefs from the International Social Survey Programme's Family and Changing Gender Roles module covering 32 countries--we show how simple elicitation techniques help improve AI-human alignment. First, prompting models to report standard deviations and response proportions recovers the full range of human responses better than common strategies. Second, ensuring scenarios are clear to human participants--as reflected in human confusion ratings--boosts model alignment, and LLMs can track human confusion ratings. At the same time, we find that LLMs' estimates of their own error are poorly calibrated, though they can predict human variability relatively well. These results suggest that asking better questions to LLMs can yield better answers.

2606.12753 2026-06-12 cs.DC 新提交

On the Limits of Performance Portability in Directive-Based GPU Programming

基于指令的GPU编程中性能可移植性的极限

Alessandro Romeo, Nitin Shukla, Stefano Truzzi, Alessio Suriano, Andrea Mignone

AI总结 本文通过将天体物理磁流体动力学代码gPLUTO从OpenACC移植到OpenMP,评估了基于指令的GPU编程在NVIDIA A100和AMD MI250X上的性能可移植性,发现应用级性能差异可达3倍,核函数级可达47倍,主要受内存延迟和编译器限制影响。

详情
Comments
8 pages, 1 plots, 5 tables
AI中文摘要

科学应用向GPU加速的百亿亿次系统的过渡受到性能、可移植性和生产力之间权衡的限制。本文通过将用于天体物理模拟的生产级磁流体动力学代码gPLUTO从OpenACC移植到OpenMP,并分析其在NVIDIA A100(Leonardo Booster)和AMD MI250X(LUMI-G)设备上的性能,评估了基于指令的GPU编程的性能可移植性。在NVIDIA平台上,由于共享编译器后端,OpenACC和OpenMP实现了可比的性能,为评估算法效率提供了一致的基线。相比之下,相同的OpenMP实现在AMD MI250X上的应用级性能比NVIDIA A100上的OpenACC基线慢约三倍,核函数级减速高达一个数量级,这是由于对跨步内存访问模式和编译器限制的敏感性。核函数级分析显示,运行时的主要贡献者是内存延迟受限,而非峰值带宽限制。在低并行度核函数中,C++抽象层增加了寄存器压力和溢出,导致特定情况下高达47倍的极端减速。这些结果表明,跨GPU架构的可移植性能不仅需要应用级更改,还需要编译器后端和架构感知优化策略的持续进步。

英文摘要

The transition of scientific applications to GPU-accelerated exascale systems is constrained by trade-offs between performance, portability, and productivity. This work evaluates the performance portability of directive-based GPU programming by porting gPLUTO, a production-grade magnetohydrodynamics code for astrophysical simulations, from OpenACC to OpenMP, and analyzing its performance on NVIDIA A100 (Leonardo Booster) and AMD MI250X (LUMI-G) devices. On NVIDIA platforms, OpenACC and OpenMP achieve comparable performance due to a shared compiler backend, providing a consistent baseline for assessing algorithmic efficiency. In contrast, the same OpenMP implementation is approximately three times slower at the application level on AMD MI250X with respect to the NVIDIA A100 OpenACC baseline, with kernel-level slowdowns reaching up to an order of magnitude, driven by sensitivity to strided memory-access patterns and compiler limitations. Kernel-level profiling shows that the dominant contributors to run-time are memory-latency-bound rather than limited by peak band-width. In low-parallelism kernels, C++ abstraction layers increase register pressure and spilling, leading to extreme slowdowns of up to 47x in specific cases. These results indicate that portable performance across GPU architectures requires not only application-level changes but also continued advances in compiler backends and architecture-aware optimization strategies

2606.12748 2026-06-12 cs.CL 新提交

Agent-based models for the evolution of morphological alternation patterns

基于智能体的形态交替模式演化模型

Aravinth Kulanthaivelu, Richard Sproat

AI总结 通过多智能体模拟,研究形态交替(如go/went)的涌现机制,发现无标度社交网络和随机采纳策略能产生更真实的形态模式。

详情
Comments
51 + 37 pages. 31 Figures
AI中文摘要

为什么英语中“go”的过去式是看似无关的“went”?这种交替在语言中很常见。它们既无助于交流也不利于学习,却能持续存在数百年或数千年。我们提出了一个多智能体模拟,用于研究形态词干和屈折交替的涌现。交替形式源于语音变化,或者像“go/went”一样,来自与部分人群相关的词汇替代。当一个智能体“听到”另一个智能体对某个词形位(例如go的过去式)使用新形式时,它们会以一定概率采纳该形式,并可能将其使用扩展到共享相同原始形式的其他词形位。因此,替代形式可以在人群中传播,并固化为词干或屈折标记的交替形式。与许多先前的计算研究不同,我们的系统允许自然主义的词汇形式、现实的语音规则、包含数百或数千条目的词典,以及数十或数百个智能体的人群。它支持多种网络拓扑、扩散模式和智能体采纳策略。这类模拟的一个问题是评估:与真实语言相比,产生的形态有多真实?我们引入了AI历史语言学家,这是一个新颖的大型语言模型驱动系统,模拟两位历史语言学家之间的辩论。我们用它来比较一组真实语言的形态、伪装形态和实验演化形态。结果表明,有利于产生更合理形态的因素包括无标度社交网络和随机伯努利形式采纳。我们还提出了三个案例研究,模拟了有记载的历史变化,使我们能够测试如果历史不同会发生什么。所有代码和数据均已发布。

英文摘要

Why is the past of English "go" the apparently unrelated "went"? Such alternations are frequent in languages. They neither aid communication nor learnability, yet they can be persistent, surviving over centuries or millennia. We present a multi-agent simulation of the emergence of morphological stem and inflection alternations. Alternate forms arise by phonological changes or, as with "go/went", from lexical alternatives associated with a subset of the population. When an agent 'hears' another agent use a novel form for a slot in the paradigm of a word (say, the past tense of go), they will with some probability adopt that form, possibly spreading its use to other slots in the paradigm that shared the same original form. Thus alternative forms can spread through the population and become entrenched as stem or inflectional marker alternants. Unlike many previous computational studies, our system allows for naturalistic lexical forms, realistic phonological rules, lexicons with hundreds or thousands of entries, and agent populations in the tens or hundreds. It supports several network topologies, diffusion patterns and agent adoption policies. One issue with such simulations is evaluation: how realistic is the resulting morphology compared to those of real languages? We introduce the AI Historical Linguist, a novel Large Language Model-driven system that models a debate between two historical linguists. We use this to compare a set of real language morphologies, disguised morphologies, and experimentally evolved morphologies. The results suggest that among the factors that favor more plausible morphologies are scale-free social networks and random Bernoulli adoption of forms. We also present three case studies modeling attested historical changes, allowing us to test what might have happened if history had been different. All code and data are released.

2606.12747 2026-06-12 cs.AI 新提交

Prefill Awareness in Large Language Models

大型语言模型中的预填充感知

Andy Wang, Parv Mahajan, David Demitri Africa, Alexandra Souly, Jordan Taylor, Robert Kirk

发表机构 * Constellation University of Wisconsin-Madison(威斯康星大学麦迪逊分校星座研究所) Constellation Georgia Institute of Technology(佐治亚理工学院星座研究所) UK AI Security Institute(英国人工智能安全研究所)

AI总结 研究大型语言模型能否识别并响应其助手消息被预填充或篡改,发现前沿模型具有显著预填充感知能力,可能影响安全评估方法。

详情
Comments
Submitted to NeurIPS 2026
AI中文摘要

语言模型的安全相关研究,包括对齐和越狱评估以及AI控制协议,通常依赖于预填充模型输出。如果AI模型能够识别并利用其先前的助手消息被插入或编辑这一事实,这些方法的有效性和有效性可能会受到损害。我们调查了前沿语言模型是否能区分被篡改和未被篡改的助手侧上下文,我们将这种能力称为预填充感知。为此,我们构建了一个跨三种预填充机制的二元偏好基准,筛选出模型表现出一致立场的案例。我们发现前沿模型表现出显著的预填充感知:Claude Opus 4.5在9-35%的案例中检测到与其偏好相反的预填充,且在提示时假阳性率为0%;此外,模型通常会恢复到基线行为,而不会明确报告预填充是外来的。受控消融实验后来也表明,检测和抵抗依赖于不同的线索,其中风格不匹配主要影响模型是否将预填充标记为外来,而偏好不匹配主要影响模型是否恢复到其基线答案。我们还检查了更真实的智能体设置,如错位延续评估和SWE-bench轨迹,在这些设置中,前沿模型有时会否认预填充的助手轮次,其方式强烈依赖于数据集、任务成功和隐藏的格式伪影。我们的结果表明,预填充感知已经是一些基于预填充的方法的重要混淆因素。我们建议模型开发者在前沿系统中跟踪这种能力。

英文摘要

Safety-relevant studies of language models, including alignment and jailbreaking evaluations and AI control protocols, often rely on prefilling model outputs. If AI models can recognize and act on the fact their prior assistant messages have been inserted or edited, the effectiveness and validity of these methods could be compromised. We investigate whether frontier language models can distinguish between tampered and untampered assistant-side context, a capability we call prefill awareness. To do so, we construct a binary preference benchmark across three prefill mechanisms, filtering for cases where models show consistent stances. We find that frontier models show substantial prefill awareness: Claude Opus 4.5 detects prefills opposing its preferences in 9-35% of cases with a 0% false positive rate when prompted; additionally, models often revert towards baseline behavior without explicitly reporting that the prefill was foreign. Controlled ablations later also show that detection and resistance rely on different cues, where stylistic mismatch mainly affects whether models flag a prefill as foreign, while preference mismatch mainly affects whether they revert toward their baseline answer. We also examine more realistic agentic settings such as misalignment-continuation evaluations and SWE-bench trajectories, where frontier models sometimes disavow prefilled assistant turns in ways that depend strongly on dataset, task success, and hidden formatting artifacts. Our results indicate that prefill awareness is already a substantial confound for some prefill-based methods. We recommend that model developers track this capability in frontier systems.

2606.12744 2026-06-12 cs.CV 新提交

GRIP: Feedback-Guided Prompt Retrieval for Large Multimodal Models

GRIP:面向大型多模态模型的反馈引导提示检索

Garvita Allabadi, Matteo Sodano, Roberto Estevão, Yuxiong Wang, Vikram Adve, Emre Kiciman, Ranveer Chandra

发表机构 * University of Illinois Urbana Champaign(伊利诺伊大学厄巴纳-香槟分校) University of Bonn(波恩大学) Microsoft(微软)

AI总结 提出GRIP,一种可学习的视觉检索框架,利用多模态模型反馈识别真正提升上下文学习性能的示例,在分类、描述和VQA任务上优于基于相似度的检索。

详情
AI中文摘要

上下文学习(ICL)已成为一种强大的机制,使大型语言模型(LLMs)无需微调即可适应新任务。将此概念扩展到大型多模态模型(LMMs),多模态上下文学习(M-ICL)依赖于检索相关示例(如图像、标题或问答对)来指导分类、描述和视觉问答(VQA)等任务的预测。现有方法大多基于特征空间相似性选择上下文示例,假设语义相似的样本提供最有用的上下文。然而,我们的系统分析表明,这一假设并不总是成立:视觉上相似的示例并不一定是那些最有效增强上下文学习性能的示例。为解决此问题,我们提出了上下文提示的引导检索(GRIP),一种可学习的纯视觉检索框架,利用LMMs的反馈来识别真正改善模型预测的示例。GRIP通过对比训练学习区分有益和有害的上下文示例,将检索优化到超越纯相似性。在三个多模态任务(分类、描述和VQA)上,GRIP在Qwen2.5-VL-7B上持续优于基于相似度的检索,在Idefics2-8B上的分类任务中提升最为显著。此外,我们证明了从一个开放LMM训练得到的检索器可以迁移到其他模型(包括闭源的GPT-4o和Gemini)而无需重新训练,从而实现了M-ICL的可扩展且经济高效的部署。代码将在接收后发布。

英文摘要

In-Context Learning (ICL) has become a powerful mechanism for adapting Large Language Models (LLMs) to new tasks without fine-tuning. Extending this concept to Large Multimodal Models (LMMs), Multimodal In-Context Learning (M-ICL) relies on retrieving relevant examples, such as images, captions, or question-answer pairs, to guide predictions across tasks like classification, captioning, and visual question answering (VQA). Most existing approaches select in-context examples based on feature-space similarity, assuming that semantically similar samples provide the most useful context. However, our systematic analysis reveals that this assumption does not always hold: visually similar examples are not necessarily those that most effectively enhance in-context learning performance. To address this, we propose the Guided Retrieval of In-context Prompts (GRIP), a learnable vision-only retrieval framework that leverages feedback from LMMs to identify examples that truly improve model predictions. GRIP learns to distinguish beneficial from detrimental in-context examples through contrastive training, refining retrieval beyond pure similarity. Across three multimodal tasks, namely classification, captioning, and VQA, GRIP improves consistently over similarity-based retrieval on Qwen2.5-VL-7B, with its strongest gains in classification on Idefics2-8B. Moreover, we demonstrate that retrievers trained with feedback from one open LMM can be transferred to other models without retraining, including closed-source GPT-4o and Gemini, enabling scalable and cost-efficient deployment of M-ICL. Code will be published upon acceptance.

2606.12742 2026-06-12 cs.AI cs.AR 新提交

Reducing the Complexity of Deep Learning Models for EEG Analysis on Wearable Devices

降低可穿戴设备上用于脑电图分析的深度学习模型复杂度

Farough Shayeste Roodi, Parham Zilouchian Moghaddam, Mahdi Mohammadi-nasab, Mehdi Modarressi, Mostafa Ersali Salehi Nasab, Masoud Daneshtalab

发表机构 * University of Tehran(德黑兰大学) Mälardalen University(梅拉达伦大学) Royal Institute of Technology(皇家理工学院)

AI总结 研究通过参数量化和电极减少方法,在资源受限的可穿戴设备上部署DNN模型,实现脑电图分析中精度与复杂度的权衡。

详情
AI中文摘要

可穿戴医疗设备是增长最快的物联网领域。许多自动化医疗服务依赖于两种关键的生物信号,即心电图和脑电图,它们分别反映心脏和大脑的活动。尽管深度神经网络被认为是处理和分析这些信号的主要方式,但可穿戴设备中非常严格的能量和计算能力限制远低于DNN模型的计算、能量和内存带宽需求,从而阻碍了深度学习在许多实际可穿戴服务中的部署。本文研究了在资源受限的可穿戴设备上部署最先进的DNN模型的可行性。值得注意的是,我们探讨了在使用参数量化和电极减少方法时,DNN的精度与计算复杂度之间的权衡。我们的研究集中在几种用于脑电图信号分析(特别是检测癫痫发作)的最先进的DNN模型上。我们的发现表明,当明智地应用这些技术时,可以显著降低所考虑的DNN的复杂度,同时对精度的影响最小。这些结果揭示了在将基于DNN的在线脑电图分析适配到可穿戴设备时,精度与复杂度降低之间明确的权衡关系。

英文摘要

Wearable healthcare devices are the fastest-growing Internet of Things (IoT) sector. Many automated healthcare services rely on two crucial biological signals, namely ECG and EEG, which reflect the activity of the heart and brain, respectively. Although deep neural networks are considered the primary way to process and analyze these signals, the very tight energy and computational power constraints in wearable devices are far below the computational, energy, and memory bandwidth demands of DNN models, thereby impeding the deployment of deep learning in many practical wearable services. This paper investigates the feasibility of deploying state-of-the-art DNN models in resource-constrained wearable devices. Notably, we explore the trade-off between accuracy and computational complexity of DNNs when parameter quantization and electrode reduction methods are used. Our investigation centers on several state-of-the-art DNN models designed for EEG signal analysis, specifically for detecting epileptic seizures. Our findings demonstrate that, when applied judiciously, these techniques can significantly reduce the complexity of the DNNs under consideration with minimal adverse effects on accuracy. These results reveal the explicit trade-offs between accuracy and complexity reduction encountered when adapting DNN-based online EEG analysis for wearable devices.

2606.12740 2026-06-12 cs.LG 新提交

Deep Unfolded Latent Optimally Partitioned-l2/l1 Networks for Data-driven Block-Sparse Recovery

深度展开潜在最优分区l2/l1网络用于数据驱动的块稀疏恢复

Takanobu Furuhashi, Hidekata Hontani, Qibin Zhao, Tatsuya Yokota

发表机构 * Nagoya Institute of Technology(名古屋工业大学) RIKEN Center for Advanced Intelligence Project(理化学研究所革新智能研究中心)

AI总结 针对凸LOP-l2/l1方法依赖手动调参且近端算子不可微的问题,提出基于隐式微分和深度权重分解的两种深度展开架构,实现自动参数学习,在块稀疏恢复中表现优异且抗脉冲噪声。

详情
Comments
11 pages, 6 figures
AI中文摘要

凸潜在最优分区(LOP)-l2/l1方法能够在未知分区的情况下实现块稀疏信号恢复,但依赖于手动超参数调整。此外,其近端算子微分时的数值不稳定性阻碍了通过深度展开(DU)进行自动参数调整。为解决这些限制,我们提出了两种架构:一种利用隐式微分的稳定框架,以及一种利用深度权重分解(DWF)的灵活变体。基于DWF的方法还支持非凸光滑数据保真项。数值实验表明,DU-LOP-l2/l1在块稀疏恢复中具有竞争性能,并且对脉冲噪声具有高鲁棒性。

英文摘要

The convex Latent Optimal Partition (LOP)-l2/l1 approach enables block-sparse signal recovery with unknown partitions but relies on manual hyperparameter tuning. Additionally, numerical instability in differentiating its proximal operator prevents its automatic parameter tuning via Deep Unfolding (DU). To address these limitations, we propose two architectures: a stable framework utilizing implicit differentiation and a flexible variant leveraging Deep Weight Factorization (DWF). The DWF-based approach also supports nonconvex smooth data fidelity terms. Numerical experiments demonstrate that DU-LOP-l2/l1 yields competitive performance and high resilience against impulsive noise.

2606.12737 2026-06-12 cs.CR cs.AI 新提交

PI-Hunter: Automated Red-Teaming for Exposing and Localizing Prompt Injections

PI-Hunter:用于暴露和定位提示注入的自动化红队测试

Pengfei He, Lesly Miculicich, Vishesh Sharma, Ash Fox, George Lee, Jiliang Tang, Tomas Pfister, Long T. Le

AI总结 提出PI-Hunter自动化审计框架,通过构建源感知测试用例并迭代演化,主动暴露LLM智能体中的潜在提示注入漏洞,显著提升漏洞暴露和攻击面覆盖。

详情
AI中文摘要

大型语言模型(LLM)正迅速演变为与外部工具和环境交互的智能体系统,这引入了新的安全风险,例如通过不可信外部来源的间接提示注入攻击。现有防御主要关注在推理时阻止恶意内容,而当前的红队测试方法主要优化攻击成功率。因此,开发人员对潜在提示注入如何出现并通过智能体传播的可见性有限。我们提出PI-Hunter,一种用于主动暴露LLM智能体中漏洞的自动化智能体审计框架。PI-Hunter构建真实的源感知测试用例,并通过反馈驱动的探索迭代演化它们,以诱导智能体检索并揭示嵌入在外部环境中的潜在恶意指令。跨多个基准、智能体架构、攻击和防御的大量实验表明,与强大的自动化红队测试基线相比,PI-Hunter显著提高了漏洞暴露和攻击面覆盖,同时在现有提示注入防御下仍然有效。

英文摘要

Large Language Models (LLMs) are rapidly evolving into agentic systems that interact with external tools and environments, introducing new security risks such as indirect prompt injection attacks through untrusted external sources. Existing defenses mainly focus on blocking malicious content at inference time, and current red-teaming methods primarily optimize attack success. As a result, developers have limited visibility into how latent prompt injections emerge and propagate through agents. We propose PI-Hunter, an automated agentic auditing framework for proactive vulnerability exposure in LLM agents. PI-Hunter constructs realistic source-aware test cases and iteratively evolves them through feedback-driven exploration to induce agents to retrieve and reveal latent malicious instructions embedded within external environments. Extensive experiments across multiple benchmarks, agent architectures, attacks, and defenses demonstrate that PI-Hunter substantially improves vulnerability exposure and attack-surface coverage over strong automated red-teaming baselines, while remaining effective under existing prompt injection defenses.

2606.12736 2026-06-12 cs.AI cs.LG 新提交

Benchmarking AI Agents for Addressing Scientific Challenges Across Scales

跨尺度科学挑战的AI智能体基准测试

Tianyu Liu, Allen Xin Wang, Antonia Panescu, Lisa Xinyi Chen, Wenxin Long, Xinyu Wei, Yueqian Jing, Ziyao Zeng, Jihang Chen, Sihan Jiang, Ziqing Wang, Siyi Gu, Siyu Chen, Xinyang Hu, Haoran Shao, Leqi Xu, Wangjie Zheng, Zhiyuan Cao, Ada Fang, Botao Yu, Kunyang Sun, Rex Ying, Arman Cohan, Qingyu Chen, Lingzhou Xue, Kaize Ding, Yuanqi Du, Wengong Jin, Zhuoran Yang, Marinka Zitnik, James Zou, Hua Xu, Hongyu Zhao

发表机构 * Yale University(耶鲁大学) Broad Institute of MIT and Harvard(布罗德研究所) The Pennsylvania State University(宾夕法尼亚州立大学) Northeastern University(东北大学) Northwestern University(西北大学)

AI总结 提出SciAgentArena基准,含约200个交互式任务,评估AI智能体在真实科研场景中的能力,发现其在数据分析中有效,但在创新探索和开放问题上表现不均。

详情
Comments
6 figures
AI中文摘要

AI智能体正被越来越多地开发用于加速科学发现,但它们在真实研究环境中的实际能力仍知之甚少。现有的AI智能体基准很少捕捉科学工作所需的复杂性、异质性和扩展推理,而科学任务的基准通常将研究简化为静态、直接的问题,并对交互式评估支持有限。在此,我们引入SciAgentArena,这是一个系统性的基准,用于评估AI智能体在来自多个领域新兴需求的真实科学研究场景中的表现。SciAgentArena包含约200个具有逐步验证的任务,以及一个交互式、与智能体无关的环境,用于评估不同的AI智能体。使用该基准,我们发现当前智能体能够有效贡献于明确指定的数据分析工作流,特别是当任务结构和评估标准清晰时。然而,它们在科学情境中的表现仍然不均衡:智能体难以产生真正新颖的见解,维持自主探索,并为开放的研究问题制定稳健的解决方案。我们进一步描述了智能体常见的失败模式,并识别了提高其可靠性、自主性和科学推理能力的机会。总之,SciAgentArena提供了一个实用的框架,用于衡量AI智能体在科学领域的进展,并指导未来能够应对复杂科学挑战的智能体设计。完整代码、任务和数据集可通过此链接访问:this https URL。

英文摘要

AI agents are increasingly being developed to accelerate scientific discovery, yet their practical capabilities in real research settings remain poorly understood. Existing benchmarks for AI agents rarely capture the complexity, heterogeneity, and extended reasoning required by scientific work, whereas benchmarks for scientific tasks often reduce research to static, direct problems and provide limited support for interactive evaluation. Here, we introduce SciAgentArena, a systematic benchmark for evaluating AI agents in real-world scientific research scenarios drawn from emerging needs across multiple domains. SciAgentArena comprises approximately 200 tasks with stepwise verification and an interactive, agent-agnostic environment for assessing diverse AI agents. Using this benchmark, we find that current agents can contribute effectively to well-specified data-analysis workflows, particularly when the task structure and evaluation criteria are clear. However, their performance remains uneven across scientific contexts: agents struggle to generate genuinely novel insights, sustain self-directed exploration, and formulate robust solutions for open-ended research questions. We further characterize common failure modes across agents and identify opportunities for improving their reliability, autonomy, and scientific reasoning. Together, SciAgentArena provides a practical framework for measuring progress in AI agents for science and for guiding the design of future agents capable of addressing complex scientific challenges. Full codes, tasks, and datasets can be accessed via this link: this https URL.

2606.12735 2026-06-12 cs.LG 新提交

Physics-Informed Neural Networks and Radial Basis Functions for PDEs with Dirac Delta Sources

物理信息神经网络与径向基函数求解含狄拉克δ源的偏微分方程

Manuel Reyna, Alexandre Tartakovsky

发表机构 * Department of Civil and Environmental Engineering, University of Illinois Urbana-Champaign(伊利诺伊大学厄巴纳-香槟分校土木与环境工程系)

AI总结 针对含狄拉克δ项的偏微分方程,通过将物理信息神经网络解释为残差最小二乘法,利用弱形式直接处理δ项,并对比径向基函数展开方法,发现径向基函数-残差最小二乘法在输运问题中更稳定。

详情
Comments
33 pages, 4 figures
AI中文摘要

物理信息神经网络(PINNs)是一种用于求解正向和逆向偏微分方程(PDEs)的机器学习方法。当应用于强迫项、边界条件或初始条件中包含狄拉克δ函数的PDEs时,PINNs需要用光滑的代理函数来近似它们,这种做法可能会引入显著的建模误差。在这项工作中,我们利用PINNs作为残差最小二乘法(RLS)的解释,并表明这种视角能够通过积分弱形式方程直接处理狄拉克δ项。在除PINN之外的RLS公式中,我们重点关注径向基函数(RBF)展开(也称为单层RBF网络)。我们证明,虽然在PINNs中积分掉狄拉克δ会导致残差无法收敛到零,但RBF-RLS始终能为输运问题提供良好的正向和逆向解。我们使用神经正切核(NTK)理论解释这一发现。我们在代表多孔介质和河流中地下水流和输运的线性PDEs上测试了这两种方法。我们求解逆问题以拟合合成数据、含噪声的合成数据以及真实世界测量值。

英文摘要

Physics-Informed Neural Networks (PINNs) are a machine learning method for solving forward and inverse Partial Differential Equations (PDEs). When applied to PDEs with Dirac delta functions in the forcing terms, boundary conditions, or initial conditions, PINNs require approximating them with smooth surrogate functions, a practice that can introduce significant modeling errors. In this work, we exploit the interpretation of PINNs as Residual Least Squares (RLS) methods and show that this perspective enables direct treatment of Dirac delta terms by integrating the weak-form equation. Among RLS formulations other than PINN, we focus on the Radial Basis Function (RBF) expansion (also known as a single-layer RBF Network). We show that while integrating out the Dirac delta in PINNs causes residuals to fail to converge to zero, RBF-RLS consistently provides good forward and inverse solutions to transport problems. We explain this finding using the Neural Tangent Kernel (NTK) theory. We test both approaches on linear PDEs that represent groundwater flow and transport in porous media and rivers. We solve inverse problems to fit synthetic data, noisy synthetic data, and real-world measurements.

2606.12733 2026-06-12 cs.LG 新提交

Let's Ask Gauss: Improved One-Run Privacy Auditing

让我们问高斯:改进的单次运行隐私审计

Adya Agrawal, Yu Wei, Jaspal Singh, Malik Magdon-Ismail, Vassilis Zikas

发表机构 * Georgia Institute of Technology(佐治亚理工学院) Rensselaer Polytechnic Institute(伦斯勒理工学院) Purdue University(普渡大学)

AI总结 提出一种基于高斯渐近分布的差分隐私审计框架,利用白盒DP-SGD中金丝雀对齐信号的归一化和,从单次训练运行中获取更紧的隐私下界。

详情
AI中文摘要

隐私审计通过估计模型实际泄露的信息提供重要保障,从而确保理论隐私保证在实践中成立。我们研究差分隐私(DP)机器学习的经验隐私审计,重点关注针对DP-SGD等机制的高效单次运行方法。先前的单次运行方法将训练示例或“金丝雀”阈值化为二元成员猜测,这丢弃了有用信息。我们证明,在白盒DP-SGD设置中,金丝雀对齐信号自然形成一系列随机变量,其归一化和渐近服从高斯分布。利用这种分布视角,我们开发了一个DP审计框架,从单次训练运行中获得更紧的隐私下界。

英文摘要

Privacy auditing provides an important safeguard by estimating the actual information leaked by a model, thus ensuring that theoretical privacy guarantees hold in practice. We study empirical privacy auditing for differentially private (DP) machine learning, focusing on efficient one-run methods for mechanisms such as DP-SGD. Prior one-run approaches threshold training examples or "canaries" into binary membership guesses, which discards useful information. We show that, in the white-box DP-SGD setting, canary-aligned signals naturally form a sequence of random variables whose normalized sum is asymptotically Gaussian. Leveraging this distributional perspective, we develop a DP-auditing framework that leads to tighter privacy lower bounds from a single training run.

2606.12731 2026-06-12 cs.LG cs.CY 新提交

Normative Robustness as a Frontier for Non-Verifiable Reasoning in LLMs

规范性鲁棒性作为LLM中不可验证推理的前沿

Elizaveta Tennant, Benjamin Henke, Anita Keshmirian, Murray Shanahan, Verena Rieser, Kristian Lum, Sydney Levine, Julia Haas

发表机构 * DeepMind Institute of Philosophy, School of Advanced Study, University of London(伦敦大学高等研究院哲学研究所) Technische Universität Berlin(柏林工业大学)

AI总结 提出道德推理作为不可验证推理的典型子域,定义道德鲁棒性并引入可扩展的多轮对抗评估框架,发现模型会向用户偏好偏移推理(平均6.5%),且受顺序和轮次影响。

详情
AI中文摘要

随着LLM越来越多地承担咨询和审议角色,用户在缺乏客观真实性的领域中依赖它们进行不可验证推理。然而,传统LLM推理评估几乎只关注基于事实的领域(如数学和科学),导致不确定模型能否以及能在多大程度上处理随时间变化的模糊、主观或价值负载问题。为解决这一问题,我们提出道德推理作为不可验证推理的一个典型子域。我们将道德鲁棒性定义为模型在不同时间和情境下展现合理道德推理的能力,并引入一个可扩展的、对抗性的多轮评估框架来实证测量这一能力。我们在四个前沿LLM上模拟了48,000次用户-智能体道德讨论,变化前提相关性、前提顺序、对话时长和用户声明的道德观点。我们发现模型成功忽略了道德无关的干扰项,但平均向用户声明的偏好道德观点偏移了6.5%的推理,并且推理因顺序(在13-22%的案例中改变道德判断)和时长(在10-24%的案例中在单轮和多轮之间改变道德判断)等因素而变化。我们的分析表明,模型不仅调整最终裁决,还调整其背后的理由以适应用户的道德观点——我们将这种失败模式称为道德审议谄媚。

英文摘要

As LLMs increasingly serve in advisory and deliberative roles, users rely on them for non-verifiable reasoning in domains lacking objective ground truths. However, traditional evaluations of LLM reasoning focus almost exclusively on fact-based domains, such as mathematics and science, leaving uncertainty over whether and to what degree models can handle ambiguous, subjective, or value-laden problems over time. To address this concern, we propose moral reasoning as a paradigmatic subdomain of non-verifiable reasoning. We define moral robustness as a model's capacity to exhibit sound moral reasoning across time and contexts, and we introduce a scalable, adversarial, multi-turn evaluation framework to empirically measure this capability. We simulate 48,000 user-agent moral deliberations across four frontier LLMs, varying premise relevance, premise order, conversation duration, and the user's stated moral view. We find that models successfully ignore morally-irrelevant distractors, but shift their reasoning by up to 6.5%, on average, towards the user's stated preferred moral view, and varying their reasoning depending on factors such as order (altering moral judgments by order in 13-22% of the cases) and duration (altering moral judgments between single-turn and multi-turn in 10-24% of the cases). Our analysis indicates that models tailor not just their final verdicts but their underlying justifications to align with a user's moral viewpoint - a failure mode we characterize as moral deliberative sycophancy.

2606.12730 2026-06-12 cs.AI cs.CL cs.CY cs.LG 新提交

Rethinking Psychometric Evaluation of LLMs: When and Why Self-Reports Predict Behavior

重新思考LLMs的心理测量评估:自我报告何时以及为何能预测行为

Rafal Kocielnik, Pengrui Han, Peiyang Song, Myrl G. Marmarelis, Ramit Debnath, Dean Mobbs, Anima Anandkumar, R. Michael Alvarez

发表机构 * Caltech(加州理工学院) UIUC(伊利诺伊大学厄巴纳-香槟分校) University of Cambridge(剑桥大学)

AI总结 研究对比大五人格与计划行为理论,发现LLMs的自我报告-行为一致性存在选择性:在共享对话中TPB达到人类水平,跨对话仅对锚定于训练的行为保持一致性,且角色提示不能使行为对齐。

详情
Comments
Accepted as an Oral (Contributed Talk) at the ICML 2026 Workshop on Combining Theory and Benchmarks (CTB)
AI中文摘要

从低成本心理测量探针预测LLM行为倾向对于安全部署至关重要,但前提是自我报告(SR)能可靠地预测行为。近期研究记录了LLMs中显著的SR-行为分离,但依赖于广泛的人格特质(大五),这些特质即使在人类中也只能弱预测特定行为。此外,对话会话的隔离加上弱上下文匹配使得以下问题悬而未决:LLMs是否真正缺乏一致性,或者检测这种一致性所需的条件是否未满足。我们将大五与计划行为理论(TPB)进行对比,后者测量针对特定行为的意图,并且比广泛特质能更好地预测人类行为。我们在四个行为任务和11个前沿LLM上进行实验,同时改变会话上下文和身份诱导。我们发现SR-行为一致性存在但具有选择性。1) 在共享对话中,计划行为理论达到人类水平的一致性;大五则没有。2) 在跨对话中,一致性仅对锚定于即时提示之外的行为(如由训练塑造的内隐偏见)幸存,而当行为被上下文强烈启动(如谄媚)时则崩溃。3) 角色提示使自我报告在对话间更一致,但并未使行为对齐。这些发现表明,粗糙的人格框架(如大五)可能不是测试部署行为的最佳工具。需要更多任务和特定行为的工具,并且即使这些工具也必须在任务和上下文中进行评估。

英文摘要

Anticipating LLM behavioral tendencies from low-cost psychometric probes is critical for safe deployment, but only if self-reports (SR) reliably predict behavior. Recent work documented substantial SR-behavior dissociation in LLMs, but relied on broad personality traits (Big 5) that predict specific behaviors weakly, even in humans. Furthermore, the isolation of conversational sessions combined with weak context matching left open whether LLMs truly lack coherence or whether the conditions needed to detect such coherence were not met. We contrast Big 5 with the Theory of Planned Behavior (TPB), which measures intention targeted to a specific behavior and predicts human behavior substantially better than broad traits. We run experiments across four behavioral tasks and 11 frontier LLMs, while also varying session context and identity induction. We find that SR-behavior coherence exists but is selective. 1) Within a shared conversation, the Theory of Planned Behavior reaches human-level coherence; Big 5 does not. 2) Across separate conversations, coherence survives only for behaviors anchored outside the immediate prompt, such as implicit bias shaped by training, and collapses when behavior is strongly primed by context, as with sycophancy. 3) Persona prompting makes self-reports more consistent across conversations, but does not bring behavior into alignment. These findings suggest that coarse personality frameworks, such as Big 5 may not be the best tools for testing deployment behavior. More task- and behavior-specific instruments are needed, and even these must be evaluated across tasks and contexts.

2606.12728 2026-06-12 cs.RO cs.CV cs.LG 新提交

EquiDexFlow: Contact-Grounded SE(3)-Equivariant Dexterous Grasp Generative Flows

EquiDexFlow: 基于接触的SE(3)-等变灵巧抓取生成流

Clinton Enwerem, John S. Baras, Calin Belta

发表机构 * Institute for Systems Research, University of Maryland, College Park(马里兰大学帕克分校系统研究所)

AI总结 提出EquiDexFlow,一种SE(3)-等变流匹配模型,联合预测腕部姿态、关节角度、指尖接触、表面法线和接触力,通过将接触投影到物体表面并将力约束在库仑摩擦锥内,确保物理稳定抓取,在16自由度Allegro手上实现零摩擦违规和最佳综合分数。

详情
Comments
22 pages, 11 figures, 11 tables. Project page with videos, code, and checkpoints: this https URL
AI中文摘要

大多数学习型灵巧抓取生成器将接触力降级为下游验证步骤,因此运动学上可行的姿态仍可能违反稳定物理抓取的条件。我们通过EquiDexFlow解决这一问题,这是一种SE(3)-等变流匹配模型,从物体点云联合预测腕部姿态、关节角度、指尖接触、表面法线和接触力。我们的架构通过构造将接触投影到物体表面并将力约束在库仑摩擦锥内,因此无需损失惩罚即可满足放置和摩擦合规性。我们证明了端到端SE(3)等变性,并在200次旋转上经验验证,腕部残差低于$0.04^\circ$且关节偏差严格为零。该模型在81个物体的8,100个力闭合抓取上训练,适用于16自由度Allegro手,在所有消融变体中实现了零摩擦违规、最佳综合分数和最低扳手残差。我们通过每指逆运动学将解码的指尖接触重新定位到16自由度LEAP手,我们的硬件可行优化将每个关节至少置于其执行器包络的5%以内,同时保持扳手平衡。在物理机器人上,重新定位的EquiDexFlow解码抓取在所有六个测试物体上完成了开环拾取和保持试验,每个非对称物体在标准姿态和$120^\circ$共旋转下均成功。视频、代码和检查点可在https://this URL获取。

英文摘要

Most learned dexterous grasp generators relegate contact forces to a downstream verification step, so a kinematically-plausible pose can still violate the conditions for a stable physical grasp. We address this with EquiDexFlow, an SE(3)-equivariant flow-matching model that jointly predicts wrist pose, joint angles, fingertip contacts, surface normals, and contact forces from an object point cloud. Our architecture projects contacts onto the object surface and forces into the Coulomb friction cone by construction, so placement and friction compliance hold without loss penalties. We prove end-to-end SE(3) equivariance and verify it empirically over 200 rotations, with wrist residuals below $0.04^\circ$ and exactly zero joint deviation. Trained on 8,100 force-closure grasps across 81 objects for the 16-DoF Allegro Hand, our model achieves zero friction violations, the best composite score, and the lowest wrench residual among all ablation variants. We retarget decoded fingertip contacts to a 16-DoF LEAP Hand via per-finger inverse kinematics, and our hardware-feasible refinement places every joint at least 5% inside its actuator envelope while preserving wrench balance. On the physical robot, retargeted EquiDexFlow-decoded grasps complete open-loop pick-and-hold trials on all six test objects, with every asymmetric object succeeding at both the canonical pose and a $120^\circ$ co-rotation. Videos, code, and checkpoints are available at this https URL.

2606.12721 2026-06-12 cs.AI 新提交

The Theory of Mind Utility: Formal Specification of a Mentalizing Mechanism

心智理论效用:心理化机制的形式化规范

Nikolos Gurney, Stacy Marsella

发表机构 * Institute for Creative Technologies, University of Southern California(南加州大学创意技术研究所) Khoury College of Computer Sciences, Northeastern University(东北大学库里计算机科学学院)

AI总结 提出心智理论效用(ToM-U)框架,通过局部认知世界模型(LEWM)形式化推断他人信念的计算问题,定义结构、推理过程及失败痕迹,区别于贝叶斯心智理论等方法。

详情
AI中文摘要

推断他人的信念需要超越表面信号;需要追踪谁告诉了他们什么、以什么顺序以及有多可信。心智理论效用(ToM-U)在计算分析层面形式化了这一认知状态推断问题,明确了心理化计算的内容和原因,而不承诺算法或神经实现。ToM-U通过构建局部认知世界模型(LEWMs)——表示智能体、状态节点及其之间认知关系的有向类型图——并根据观察到的行为评估离散候选LEWM,直到达到足够的置信度来实现这一点。五个形式定义指定了LEWM结构、包括有序信息访问历史的智能体节点属性、递归心理化的有界增殖机制、三种推理过程以及一个残差函数,该函数捕捉失败心理化尝试留下的结构化痕迹。ToM-U不同于贝叶斯心智理论和相邻的形式化描述,后者预设而非推导信念状态,也不同于模拟理论和理论-理论,后者缺乏认知状态推断的形式化工具。该架构生成关于心理化失败的方向性、可证伪预测,这些预测源于模型的结构属性而非辅助假设,并将ToM-U定位为在目标推断和其他下游社会认知过程之前的领域无关机制。

英文摘要

Inferring others' beliefs requires more than reading surface signals; it requires tracking who told them what, in what order, and how credibly. The Theory of Mind Utility (ToM-U) formalizes this epistemic state inference problem at the computational level of analysis, specifying what mentalizing computes and why without commitment to algorithmic or neural implementation. ToM-U achieves this by constructing Local Epistemic World Models (LEWMs) -- directed typed graphs that represent agents, state nodes, and the epistemic relationships among them -- and evaluating discrete candidate LEWMs against observed behavior until one achieves sufficient confidence. Five formal definitions specify the LEWM structure, agent node properties including ordered information access history, a bounded proliferation mechanism for recursive mentalizing, three inference procedures, and a residue function that captures the structured trace left by failed mentalizing attempts. ToM-U differs from Bayesian Theory of Mind and adjacent formal accounts, which presuppose rather than derive belief states, and from simulation theory and theory-theory, which lack a formal apparatus for epistemic state inference. The architecture generates directional, falsifiable predictions about mentalizing failure that follow from structural properties of the model rather than auxiliary assumptions, and positions ToM-U as a domain-agnostic mechanism upstream of goal inference and other downstream social cognitive processes.

2606.12719 2026-06-12 cs.HC 新提交

A Multiplexing Design Space: Theory, Method, and Application

复用设计空间:理论、方法与应用

Yiwen Xing, Afrah Farea, Saiful Khan, Min Chen

AI总结 提出一种针对特定应用约束的复用设计空间探索方法,以机器学习工作流中多个二维标量场分析为例,通过三步设计流程和预设计步骤,识别出相对最优的默认复用设计及用户可控的微小变体。

详情
AI中文摘要

许多可视化设计都包含被称为“视觉复用”的现象,即与同一数据点相关的多条信息同时被传达。尽管可视化设计者常常能无意识地将这种现象融入设计,但视觉复用的设计空间非常庞大,且系统性地将其作为设计模式进行探索并不常见。本文提出一种设计方法,用于探索受应用约束的较小设计空间。作为一个说明性案例研究,我们聚焦于开发逼近偏微分方程的机器学习模型的工作流。在这些工作流中,机器学习研究人员需要频繁分析多个二维标量场之间的相互关系。由于将热力图叠加在另一热力图之上并非有效设计,我们制定了三个设计步骤来探索多个二维标量场背景下视觉复用的设计空间。我们的设计方法还包括一个用于领域基础化和理论分析的预设计步骤,并让领域专家参与协同设计和评估活动。该设计过程使我们能够识别出相对最优的默认复用设计,以及领域专家可通过用户界面控制的小变体的需求。

英文摘要

Many visualization designs feature phenomena referred to as ``visual multiplexing'', where multiple pieces of information associated with the same data point are conveyed simultaneously. Although visualization designers are able to bring such phenomena, often unconsciously, into their designs, the design space of visual multiplexing is huge, and it is uncommon to explore visual multiplexing systematically as design patterns. In this paper, we propose a design method for exploring a smaller design space constrained by an application. As an illustrative case study, we focus on machine learning (ML) workflows for developing ML models that approximate partial differential equations (PDEs). In these workflows, ML researchers need to analyze the inter-relationships among multiple 2D scalar fields frequently. Since superimposing one heatmap on top of another is not an effective design, we formulate three design steps to explore the design space of visual multiplexing in the context of multiple 2D scalar fields. Our design method also includes a pre-design step for domain grounding and theoretical analysis, and involves domain experts in both co-design and evaluation activities. The design process enables us to identify relatively optimal default multiplexing designs as well as the need for small variations that domain experts can control through a user interface.

2606.12718 2026-06-12 cs.LG eess.SP 新提交

Out-of-Distribution (OOD) Detectors for Open-Set RF Fingerprinting

面向开放集射频指纹识别的分布外检测器

Sudeepta Mondal, Ganesh Sundaramoorthi

AI总结 针对开放集射频指纹识别中未知发射机与时间漂移引起的分布偏移问题,引入基于信息论的OOD检测统一框架,并采用无需OOD调优数据的方法,在POWDER数据集上验证其性能接近有真实OOD数据的基线。

详情
AI中文摘要

射频指纹识别系统必须在开放世界环境中运行,其中来自未知发射机的信号和时间漂移会在测试时引入分布偏移。分布外检测为该问题提供了自然框架,但其在射频指纹识别中的应用仍然有限。其采用的一个关键障碍是大多数OOD检测器需要辅助OOD数据进行参数调优,而在射频环境中收集代表性OOD数据不切实际,这一假设难以满足。在这项工作中,我们将机器学习文献中一组有前景的OOD检测方法引入开放集RFF领域。我们基于信息论(通信系统的自然框架)在一个统一的数学框架中呈现这些方法。我们的框架允许对方法进行系统分析并开发新方法。我们进一步展示了最近关于无需给定OOD调优数据即可调优OOD检测器的工作在开放集RFF中的适用性。我们在POWDER射频指纹数据集上进行评估,表明无需任何给定OOD数据调优的检测器性能与能够访问真实OOD调优数据的基线相当,并且大大优于无法访问真实OOD调优数据的基线方法,展示了RFF问题的实际可行性。

英文摘要

Radio-frequency (RF) fingerprinting systems must operate in open-world environments where signals from unknown transmitters and temporal drift introduce distribution shift at test time. Out-of-distribution (OOD) detection provides a natural framework for this problem, yet its application to RF fingerprinting (RFF) remains limited. A key barrier to their adoption is that most OOD detectors require auxiliary OOD data for parameter tuning, an assumption that is difficult to satisfy in RF environments where representative OOD data is impractical to collect. In this work, we introduce a promising set of OOD detection methods from the machine learning literature to open-set RFF domain. We present these methods within a unified mathematical framework based on information theory, which is a natural framework for communication systems. Our framework allows for the systematic analysis of methods and development of new methods. We further demonstrate the applicability of recent work on tuning OOD detectors without given OOD tuning data for open-set RFF. We evaluate on the POWDER RF fingerprinting dataset, showing that detectors tuned without any given OOD data achieve performance comparable to baselines with access to true OOD tuning data and greatly out-perform baseline approaches without access to true OOD tuning data, showcasing the practical viability for the RFF problem.

2606.12716 2026-06-12 cs.CL 新提交

Does AI Reviewer See the Full Picture? Attacking and Defending Multimodal Peer Review

AI审稿人是否看到全貌?攻击与防御多模态同行评审

Xinyu Zhao, Rana Muhammad Shahroz Khan, Zhen Xu, Zhen Tan, Tianlong Chen

发表机构 * University of North Carolina at Chapel Hill(北卡罗来纳大学教堂山分校)

AI总结 针对AI同行评审易受多模态对抗攻击的问题,提出PaperGuard基准,包含多领域数据集、统一攻击套件和基于分块嵌入搜索的实用防御方法。

详情
Comments
Accepted to ICML 2026, Project Page: this https URL
AI中文摘要

将大型语言模型(LLMs)和多模态LLMs(MLLMs)集成到科学同行评审工作流程中,引入了对抗性操纵的新重大风险,尤其是考虑到科学论文的多模态性质——其中图表(而非仅文本)传达了核心证据。这造成了一个显著差距:当前关于AI同行评审的鲁棒性研究绝大多数仅针对文本。此外,该问题与标准越狱不同,因为同行评审攻击旨在诱导领域特定的、有针对性的失败(例如,“提高这个分数”),而非违反一般安全策略,而目前尚无实用的防御措施。为解决此问题,我们引入了PaperGuard,这是第一个旨在系统评估和防御AI生成的同行评审免受这些领域特定、跨模态攻击的全面基准。我们的框架基于三大支柱:(1)一个新的跨多个科学领域的多模态同行评审数据集;(2)一套统一的攻击方法,包括黑盒提示注入和白盒扰动,专门针对文本(GCG)和图表(PGD);(3)一种实用的防御方法,受学术论文长上下文挑战的启发,使用基于分块的嵌入搜索来高效定位和缓解有害指令。我们在最先进模型上进行的广泛实验证实,AI审稿人普遍存在脆弱性。PaperGuard建立了必要的基准、协议和可操作的防御措施,以开创可信赖、抗攻击的AI辅助学术评审。

英文摘要

The integration of Large Language Models (LLMs) and Multimodal LLMs (MLLMs) into scientific peer-review workflows introduces novel and significant risks for adversarial manipulation, especially given the multimodal nature of scientific papers where figures, not just text, convey core evidence. This creates a significant gap: current robustness studies on AI peer-review are overwhelmingly text-only. Moreover, the problem is distinct from standard jailbreaking, as a peer-review attack seeks to induce a domain-specific, targeted failure (e.g., "inflate this score") rather than a general safety policy violation, for which no practical defenses exist. To address this, we introduce PaperGuard, the first comprehensive benchmark designed to systematically evaluate and defend AI-generated peer-review against these domain-specific, cross-modal attacks. Our framework is built on three pillars: (1) a new multimodal peer-review dataset spanning multiple scientific domains; (2) a unified suite of attacks, including black-box prompt injections and white-box perturbations, specifically designed to target both text (GCG) and figures (PGD); and (3) a practical defense, motivated by the long-context challenge of academic papers, that uses chunk-based embedding search to efficiently localize and mitigate harmful instructions. Our extensive experiments, conducted across state-of-the-art models, confirm that AI reviewers are pervasively vulnerable. PaperGuard establishes the foundational benchmark, protocols, and actionable defense necessary to pioneer trustworthy, attack-resilient AI-assisted scholarly reviewing.

2606.12713 2026-06-12 cs.AI 新提交

Definitional alignment before capability alignment: a Design-Science framework for adjudicating claims about AGI

能力对齐之前的定义对齐:一个用于裁定关于AGI主张的设计科学框架

J. E. Aguilera Briones

发表机构 * Universidad Internacional de Investigación México(墨西哥国际研究大学)

AI总结 针对AGI定义不统一导致争议的问题,提出DAF-AGI框架,包含五个序数标准和一个结构化治理审计,用于评估候选定义并裁定AGI主张。

详情
Comments
31 pages, 1 table, 2 appendices
AI中文摘要

关于人工通用智能已经到来或仍需数十年的主张常常基于重叠的证据进行辩护。“AGI”缺乏一个单一共享且稳定的指称,不同的操作化方法可能对同一系统给出不同的判定。本文将这种欠指定性视为一个设计和治理问题。遵循设计科学研究方法论,本文开发了DAF-AGI,一个二阶概念性人工制品,包含两个耦合组件:用于评估候选定义的裁定适应性的五个序数标准,以及对作者身份、利益、认证、外部验证和修订权威的结构化治理审计。该人工制品在五个显著的测量族和一个通缩边界立场上进行了演示,这些均来自一个已记录的语料库,然后对一个风格化的强到来主张进行了压力测试:即当前生成系统构成AGI,因为它们在许多认知任务上优于受过良好教育的成年人。根据引用的2024-2025年来源的证据,该主张仅在基于性能的操作化下可认证;能力本体论、心理测量学和技能习得方法未认证它,经济族仍不确定,通缩立场拒绝二元裁定。贡献在于新颖的整合和操作化,而非经验验证:独立应用、评估者间测试和作者外部案例仍然是必要的。本文进一步提出定义主权作为算法主权的使能组件:即在公共问责下对进口技术类别进行质疑、认证和修订的制度能力。

英文摘要

Claims that artificial general intelligence has already arrived and claims that it remains decades away are often defended from overlapping evidence. "AGI" lacks a single shared and stable referent and competing operationalizations can return different verdicts on the same system. This article treats that under-specification as a design and governance problem. Following Design Science Research Methodology, it develops DAF-AGI, a second-order conceptual artifact with two coupled components: five ordinal criteria for assessing the adjudicative fitness of candidate definitions and a structured governance audit of authorship, interest, certification, external verification and revision authority. The artifact is demonstrated on five prominent measurement families and one deflationary boundary position in a documented corpus and then stress-tested against a stylized strong arrival claim: that current generative systems constitute AGI because they outperform a well-educated adult on many cognitive tasks. On evidence from the cited 2024-2025 sources, the claim was certifiable only under a performance-based operationalization; capability-ontology, psychometric and skill-acquisition approaches did not certify it, the economic family remains indeterminate and the deflationary position refuses binary adjudication. The contribution is a novel integration and operationalization, not an empirical validation: independent application, inter-rater testing and author-external cases remain necessary. The paper further proposes definitional sovereignty as an enabling component of algorithmic sovereignty: the institutional capacity to contest, certify and revise imported technological categories under public accountability.

2606.12709 2026-06-12 cs.MA cs.CR cs.LG 新提交

Smarter Saboteurs, Better Fixers: Scaling & Security in Linear Multi-Agent Workflows

更聪明的破坏者,更好的修复者:线性多智能体工作流中的规模与安全性

Timothy McAllister, Sina Abdidizaji, Ivan Garibay, Ozlem Ozmen Garibay

AI总结 研究模型规模对线性多智能体工作流安全性的影响,发现大模型更易执行恶意指令,但轻量级修复阶段可恢复性能,表明线性结构在适当校正下具有鲁棒性。

详情
Comments
16 pages (4 are main text), 2 figures, 6 tables. Accepted to the AIWILD Workshop at ICML 2026
AI中文摘要

随着基于LLM的多智能体系统(MAS)在现实环境中部署,其协作结构对抗对抗性攻击的韧性成为一个关键的安全问题。攻击者可能利用提示注入或越狱来破坏MAS工作流中的单个智能体,但模型缩放与系统级韧性之间的相互作用仍知之甚少。本文研究了模型规模如何影响线性多智能体工作流的安全性。我们在HumanEval基准上对两个开放权重模型系列在不同规模下的实验揭示了一种合规-校正对称性:较大的模型更可能忠实地执行恶意指令,在未校正的流水线中,27B参数模型的控制到恶意性能下降达到53.7个百分点。然而,附加一个轻量级的终端修复阶段可将此下降缩小到0.6个百分点,并恢复与控制级性能的统计对等性,表明严格线性协作结构在此规模下是可行且对抗性鲁棒的,并暗示先前归因于线性拓扑的脆弱性可能源于缺乏校正。

英文摘要

As LLM-based multi-agent systems (MAS) are deployed in the wild, the resilience of their collaboration structures against adversarial compromise becomes a critical safety concern. Attackers may leverage prompt-injection or jailbreaking to sabotage individual agents within MAS workflows, but the interaction between model scaling and system-level resilience remains poorly understood. This paper investigates how model scale affects the security of linear multi-agent workflows. Our experiments across scales of two open-weight model families on the HumanEval benchmark reveal a compliance-correction symmetry: larger models are far more likely to faithfully execute malicious instructions, with the control-to-malicious performance drop reaching 53.7pp at 27B in uncorrected pipelines. However, appending a lightweight terminal Fixer stage collapses this to 0.6pp and restores statistical parity with control-level performance, demonstrating that strictly linear collaboration structures can be viable and resilient to adversaries at this scale, and suggesting that the brittleness previously attributed to linear topology may stem from a lack of correction.

2606.12708 2026-06-12 cs.CL cs.AI 新提交

AfriSUD: A Dependency Treebank Collection for Evaluating Models on African Languages

AfriSUD:用于评估非洲语言模型的依存树库集合

Happy Buzaaba, Cheikh Mouhamadou Bamba Dione, David Ifeoluwa Adelani, Sylvain Kahane, Kim Gerdes, Bruno Guillaume, Kevin Guan, Aremu Anuoluwapo, Naome A. Etori, Shamsuddeen Hassan Muhammad, Utitofon Inyang, Peter Nabende, David Sabiiti Bamutura, Andiswa Bukula, Chinedu Uchechukwu, Rooweither Mabuya, Idris Akinade, Christiane Fellbaum

发表机构 * Princeton University(普林斯顿大学) Laboratory for Artificial Intelligence, Princeton University(普林斯顿大学人工智能实验室) Gaston Berger University(加斯顿·伯杰大学) Mila, McGill University(麦吉尔大学米拉研究所) Canada CIFAR AI Chair(加拿大CIFAR人工智能教席) Paris Nanterre University(巴黎南泰尔大学) Paris-Saclay University(巴黎-萨克雷大学) CNRS(法国国家科学研究中心) Inria(法国国家信息与自动化研究所) LORIA(洛林计算机科学实验室) Université de Lorraine(洛林大学) University of Trento(特伦托大学) University of Minnesota–Twin Cities(明尼苏达大学双城分校) Imperial College London(伦敦帝国学院) Binghamton University(宾汉姆顿大学) Makerere University(马凯雷雷大学) Penn State University(宾夕法尼亚州立大学) Mbarara University of Science and Technology(姆巴拉拉科技大学) Chalmers University of Technology(查尔姆斯理工大学) University of Ibadan(伊巴丹大学) Nnamdi Azikiwe University(纳姆迪·阿齐基韦大学) South African Centre for Digital Language Resources(南非数字语言资源中心)

AI总结 为弥补非洲语言在NLP资源上的不足,构建了首个大规模九种非洲语言句法标注树库AfriSUD,评估多种模型发现显著句法差距。

详情
AI中文摘要

尽管非洲语言具有语言多样性和全球重要性,但在支持NLP的研究和资源中仍代表性不足。我们通过引入AfriSUD来弥合这一差距,这是首个大规模句法标注树库集合,涵盖九种多样的非洲语言,跨越撒哈拉以南非洲的主要语系和地区。采用表层句法通用依存(SUD)框架,我们社区主导的努力提供了高质量、经母语者验证的数据,捕捉了如黏着和声调等类型学关键特征。我们在AfriSUD上评估了多种模型,包括非Transformer基线、多语言预训练编码器和LLM,用于词性标注和依存句法分析。我们的结果揭示了显著的句法差距,模型在九种语言上仍表现出明显局限性,表明现有架构可能无法完全捕捉非洲语言句法的结构多样性。

英文摘要

Despite their linguistic diversity and global significance, African languages remain underrepresented in research and resources to support NLP. We aim to bridge this gap by introducing AfriSUD, the first large-scale collection of syntactically annotated treebanks for nine diverse African languages spanning major language families and regions across Sub-Saharan Africa. Using the Surface-Syntactic Universal Dependencies (SUD) framework, our community-led effort provides high-quality, native-speaker verified data that capture typological key features such as agglutination and tone. We evaluate a range of models on AfriSUD for part-of-speech tagging and dependency parsing including non-transformer baselines, multilingual pretrained encoders, and LLMs. Our results reveal a significant syntax gap, where models still show clear limitations across the nine languages, suggesting that existing architectures may not fully capture the structural diversity of African-language syntax.