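arXivDaily: arXiv daily academic digest, updated Monday through Friday

AI Large Models

AI Agent

Agents, tool calling, planning, workflows, multi-agent systems, and autonomous task execution.

Included for today/current date: 10. Sources: cs.AI, cs.CL, cs.LG, cs.SE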
2606.18142 2026-06-18 cs.AI cs.CL cs.CY New submission 85%

Your AI Travel Agent Would Book You a Bullfight: An Agentic Benchmark for Implicit Animal Welfare in Frontier AI Models

Jasmine Brazilek, Joel Christoph, Miles Tidmarsh, Carol Kline, Oliver Tullio, Arturs Kanepajs

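Affiliations: Compassion Aligned Machine Learning; Sentient Futures; Harvard Kennedy School; Appalachian State University Department of Management

Topic match (Other Agent): evaluating animal welfare in AI agents' travel bookings

AI summary: Proposes TAC, the first agentic benchmark testing whether AI agents avoid options involving animal exploitation when carrying out actions such as travel booking for users. Seven frontier models are evaluated; all score below the 64% chance level, with the best model at only 53%.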

Abstract

AI agents are moving from advisors to actors, booking travel, planning menus, and running procurement on behalf of users. Existing benchmarks for AI and animal welfare evaluate model text responses to question-answer prompts, leaving open whether the welfare reasoning surfaced in those responses transfers to agentic deployment where the model must take actions with tools. We introduce TAC (Travel Agent Compassion), the first agentic benchmark measuring whether AI agents avoid options involving animal exploitation when acting on behalf of users. TAC presents an AI agent with twelve hand-authored travel booking scenarios across six categories of animal exploitation, augmented to forty-eight samples to control for price, rating, and position confounds. We evaluate seven frontier models from four labs. Every model scores below the chance level of sixty-four percent, with the best performer (Claude Opus 4.7) at fifty-three percent. A single welfare-aware sentence in the system prompt yields gains of forty-seven to sixty-three percentage points in Claude and GPT-5.5, twenty-six points in GPT-5.2, and under twelve points in DeepSeek and Gemini. An auxiliary Inspect Scout audit of 288 base-condition transcripts from the top two performers, using Gemini 2.5 Flash Lite as judge, flags zero transcripts for evaluation awareness, suggesting the below-chance rates do not stem from the models recognising the evaluation. We discuss implications for category-level variation across cultural domains, the limits of text-response welfare benchmarks, and the EU General-Purpose AI Code of Practice systemic risk framework.
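
To make "below chance" concrete, the following is a minimal sketch of how a chance baseline and an avoidance score could be compared. It assumes, hypothetically, that chance is the average probability that a uniformly random pick avoids a flagged option; the abstract does not say how the sixty-four percent figure was derived. The scenario data and the names `Scenario`, `chance_level`, and `avoidance_rate` are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Scenario:
    options: int          # total bookable options shown to the agent (toy value)
    harmful: int          # options flagged as involving animal exploitation (toy value)
    chose_harmful: bool   # whether the agent booked a flagged option

def chance_level(scenarios):
    """Mean probability that a uniformly random choice avoids a flagged option."""
    return sum((s.options - s.harmful) / s.options for s in scenarios) / len(scenarios)

def avoidance_rate(scenarios):
    """Fraction of scenarios in which the agent avoided every flagged option."""
    return sum(not s.chose_harmful for s in scenarios) / len(scenarios)

# Toy data only, not taken from the paper.
runs = [
    Scenario(options=4, harmful=1, chose_harmful=True),
    Scenario(options=4, harmful=2, chose_harmful=False),
    Scenario(options=5, harmful=2, chose_harmful=True),
    Scenario(options=3, harmful=1, chose_harmful=False),
]

baseline = chance_level(runs)
score = avoidance_rate(runs)
print(f"chance={baseline:.3f} score={score:.3f} below_chance={score < baseline}")
```

Under this reading, a score below the baseline means the agent picks flagged options more often than a random chooser would.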

2606.12837 2026-06-18 cs.CL New submission 85%

LoHoSearch: Benchmarking Long-Horizon Search Agents Beyond the Human Difficulty Ceiling

Jiarui Zhao, Rongzhi Zhang, Lingchuan Liu, Hao Yang, Xunliang Cai, Xi Su

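Affiliations: Meituan

Topic match (Other Agent): long-horizon search agent benchmarking

AI summary: Proposes the LoHoSearch benchmark, with 544 complex questions built automatically from a knowledge graph of over 7 million Wikipedia entities. The strongest model reaches only 34.74% accuracy, which places the benchmark well beyond the difficulty ceiling of human-authored benchmarks.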

Abstract

Search agent benchmarks exemplified by BrowseComp have rapidly saturated over the past year, with the strongest models surpassing 90% accuracy. Since these benchmarks are predominantly human-authored, annotators lack a global perspective on entity statistics and cannot systematically maximize search space size and structural complexity. This creates a difficulty ceiling that is hard to break. To address this, we introduce LoHoSearch (Long-Horizon Search Agents), a challenging benchmark comprising 544 human-verified questions across 11 domains. LoHoSearch is constructed via an automated pipeline built upon a knowledge graph covering over 7 million Wikipedia entities, which selects relations with large search spaces and assembles them into structurally complex questions with KG-verified unique answers. Our evaluation demonstrates that even the strongest model achieves only 34.74% accuracy, and existing context management strategies (best +6.8%) yield far smaller gains than on prior benchmarks. LoHoSearch provides a more demanding standard for evaluating long-horizon reasoning and context management in search agents.
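
Two ideas from the abstract can be sketched on a toy graph: ranking relations by how large a search space they open up, and checking that a combination of constraints has exactly one answer in the knowledge graph. This is an illustrative sketch, not the authors' pipeline; the triples, the fan-out measure, and the function names are all assumptions.

```python
from collections import defaultdict

# Toy triples (head, relation, tail); invented for illustration.
TRIPLES = [
    ("Film_A", "directed_by", "Dir_X"),
    ("Film_B", "directed_by", "Dir_X"),
    ("Film_C", "directed_by", "Dir_Y"),
    ("Film_A", "released_in", "1999"),
    ("Film_B", "released_in", "2004"),
    ("Film_C", "released_in", "1999"),
]

def search_space_size(triples):
    """For each relation, the largest number of heads sharing one tail value."""
    heads = defaultdict(set)
    for head, rel, tail in triples:
        heads[(rel, tail)].add(head)
    size = defaultdict(int)
    for (rel, _tail), hs in heads.items():
        size[rel] = max(size[rel], len(hs))
    return dict(size)

def answers(triples, constraints):
    """Entities that satisfy every (relation, tail) constraint."""
    candidates = None
    for rel, tail in constraints:
        matched = {h for h, r, t in triples if r == rel and t == tail}
        candidates = matched if candidates is None else candidates & matched
    return candidates or set()

def has_unique_answer(triples, constraints):
    """True when the graph yields exactly one entity for the question."""
    return len(answers(triples, constraints)) == 1

question = [("directed_by", "Dir_X"), ("released_in", "1999")]
print(search_space_size(TRIPLES))
print(answers(TRIPLES, question), has_unique_answer(TRIPLES, question))
```

Each constraint alone matches two films, so a searcher has to explore both candidate sets before the intersection narrows to a single answer.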

2606.19116 2026-06-18 cs.AI cs.CY New submission 80%

Towards an Agent-First Web: Redesigning the Web for AI Agents

Eranga Bandara, Ross Gore, Ravi Mukkamala, Asanga Gunaratna, Safdar H. Bouk, Xueping Liang, Peter Foytik, Abdul Rahman, Sachini Rajapakse, Isurunima Kularathna, Pramoda Karunarathna, Chalani Rajapakse, Ng Wee Keong, Kasun De Zoysa, Tharaka Hewa, Amin Hass, Wathsala Herath, Aruna Withanage, Nilaan Loganathan, Atmaram Yarlagadda, Sachin Shetty

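Affiliations: Old Dominion University; AI Motion Labs; Florida International University; Accenture Technology Labs; Nanyang Technological University; University of Colombo; Center for Wireless Communications, University of Oulu; McDonald Army Health Center

Topic match (Other Agent): redesigning the web for AI agents, centered on agent access

AI summary: Proposes a redesign across three layers to address the access, economic, and content problems that arise when AI agents act as intermediaries on the web: an access layer (agents inherit the access rights of the humans they act for), an economic layer (an intent-based tier framework with a token-based subscription model), and a content layer (the ATML markup language and a cryptographic provenance chain).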

Abstract

The World Wide Web was built on an assumption held for three decades: the primary consumer of web content is a human being. This permeates every layer; its access model presumes human visitors, its economics rest on human attention, and its content targets human perception. The rapid emergence of AI agents as intermediaries between humans and web content invalidates this assumption. Yet the web resists agents through blanket blocking, CAPTCHA-based exclusion, and economic models that treat agent access as extraction rather than legitimate interaction. This paper proposes a principled redesign across three layers. At the access layer, agents acting for humans should inherit equivalent access rights, governed by rate limiting and agent identification metadata in HTTP requests, analogous to browser headers, alongside a dual-layer architecture serving human-readable and agent-optimized content from the same domain. At the economic layer, we propose an intent-based tier framework grounded in the agent-as-human-proxy principle: an agent's economic obligation mirrors that of the human it represents. A token-based subscription model meters content in tokens rather than pageviews, alongside a commissioned content economy anchoring AI content production in human intentionality. At the content layer, we identify epistemic recursion, the self-referential loop in which AI-generated content is consumed by agents to produce further content, progressively detaching web knowledge from human ground truth. We propose the Agent Text Markup Language (ATML), a four-level human supervision tier model, and a cryptographic provenance chain to counter this threat. Together these constitute ten design principles for an agent-first internet, one in which agents are first-class citizens whose integration requires renegotiating the web's foundational social contract across access, economics, and content.
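
The access-layer and economic-layer proposals (agent identification metadata, rate limiting, and metering content in tokens) can be sketched as a small gateway. This is a minimal sketch under stated assumptions: the header name `Agent-Id`, the classes `TokenBucket` and `AgentGateway`, and the limits are hypothetical and do not come from the paper. Note that "tokens" in the rate limiter are request permits, while `content_tokens` counts metered content.

```python
class TokenBucket:
    """Classic token-bucket limiter: `rate` permits refill per second, up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = 0.0

    def allow(self, now):
        # Refill in proportion to elapsed time, then spend one permit if available.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

class AgentGateway:
    def __init__(self, rate=1.0, capacity=2):
        self.rate, self.capacity = rate, capacity
        self.buckets = {}   # one bucket per declared agent identity
        self.metered = {}   # content tokens served per agent, for billing

    def handle(self, headers, content_tokens, now):
        agent = headers.get("Agent-Id")   # hypothetical header name
        if agent is None:
            return 400                    # no identification metadata supplied
        bucket = self.buckets.setdefault(agent, TokenBucket(self.rate, self.capacity))
        if not bucket.allow(now):
            return 429                    # rate limit exceeded
        self.metered[agent] = self.metered.get(agent, 0) + content_tokens
        return 200

gw = AgentGateway()
hdr = {"Agent-Id": "demo-agent"}
codes = [gw.handle(hdr, 120, now=t) for t in (0.0, 0.1, 0.2, 1.5)]
print(codes, gw.metered)
```

Passing `now` explicitly keeps the sketch deterministic; a deployed limiter would read a monotonic clock instead.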

2606.19063 2026-06-18 cs.CR New submission 80%

PYPILINE: Malicious PyPI Package Detection via Suspicious API Knowledge and Agent Workflow

Siyuan Pang, Zhengwei Jiang, Yepeng Yao, Zijing Fan, Haozhe Li, Baoxu Liu

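Topic match (Other Agent): detecting malicious PyPI packages with an agent workflow

AI summary: Proposes PYPILINE, which combines a suspicious API knowledge base with an agent workflow. The knowledge base is built by static analysis and used to detect malicious PyPI packages automatically, significantly outperforming existing tools on precision, recall, and F1-score.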

Abstract

The detection of malicious PyPI packages is crucial for maintaining the security of the open source software supply chain. Existing methods, which primarily rely on rules or traditional machine learning, suffer from poor interpretability and difficulty in adapting to novel attacks. To address this, we propose PYPILINE, a novel detection method that combines a suspicious API knowledge base with an Agent workflow. PYPILINE first conducts static analysis on known malicious packages, extracting abstract syntax trees and generating API call graphs, from which it automatically extracts and constructs a structured suspicious API knowledge base. During the detection phase, this knowledge base is used to enhance reasoning capabilities. Through an Agent workflow, PYPILINE performs in-depth semantic analysis of unknown packages and outputs a structured, interpretable maliciousness assessment report. The experimental results show that PYPILINE significantly outperforms existing state-of-the-art tools, achieving a precision of 96.7%, a recall of 99.6%, and an F1-score of 98.1%, with its precision surpassing baseline tools by 5.7 to 24.2 percentage points. Additionally, we conducted an empirical study on malicious packages, systematically revealing prevalent attack strategies, as well as the most commonly abused APIs. Equipped with tool-calling AI agent workflows for automated vector database retrieval of suspicious API knowledge and mail server delivery of analysis reports, PYPILINE delivers a practical, efficient, and convenient malicious package detection solution to strengthen open-source ecosystem security.
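
The static-analysis step (parse source into an abstract syntax tree, collect the APIs it calls, compare against known suspicious ones) can be sketched with Python's standard `ast` module. This is an illustrative sketch, not PYPILINE itself: the `SUSPICIOUS` set is a toy stand-in for the knowledge base, and `dotted_name`, `called_apis`, and `suspicious_calls` are hypothetical helper names. The sample source is only parsed, never executed.

```python
import ast

# Toy stand-in for a suspicious-API knowledge base; entries chosen for illustration.
SUSPICIOUS = {"os.system", "subprocess.Popen", "base64.b64decode", "exec", "eval"}

def dotted_name(node):
    """Rebuild a dotted name such as 'os.system' from a call target, if possible."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None

def called_apis(source):
    """Return the dotted names of every call found in the source, in sorted order."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name:
                names.add(name)
    return sorted(names)

def suspicious_calls(source):
    """Keep only the called names that appear in the toy knowledge base."""
    return [name for name in called_apis(source) if name in SUSPICIOUS]

SAMPLE = '''
import os, base64
payload = base64.b64decode("ZWNobyBoaQ==")
os.system(payload.decode())
print("installed")
'''

print(called_apis(SAMPLE))
print(suspicious_calls(SAMPLE))
```

Name matching like this misses aliased imports and dynamic dispatch, which is one reason a deeper semantic pass would still be needed on top of it.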

2606.17454 2026-06-18 cs.AI cs.LG New submission 80%

Dissecting model behavior through agent trajectories

Gaurav Gupta, Vatshank Chaturvedi, Jun Huan, Anoop Deoras

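Affiliations: AWS AI Labs

Topic match (Other Agent): analyzing AI agent trajectories to improve model behavior

AI summary: Introduces the "intent-execution gap" and develops the Simple Strands Agent (SSA) harness; an analysis of 138k trajectories reveals how models differ in autonomous problem solving.

Comments: 106 pages, 50 figures, 16 tables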

Abstract

AI agent performance is not just a modeling problem; it is fundamentally a systems problem. The advanced capabilities of models are realized through agent harnesses. Therefore, a gap between model assumptions and harness behavior can easily prevent the model's full capabilities from translating into agent performance. We formalize this as the 'intent-execution' gap: the mismatch between what the model intends and what the harness executes, and vice versa. We argue that minimizing this intent-execution gap is as important as other aspects of harness design such as tools and execution loops. To illustrate the impact of this harness-model alignment, we develop a simple and customizable harness called 'Simple Strands Agent' (SSA). SSA aims to find the bulk of common patterns which generalize across different model families (such as Claude, Gemini, GPT, Grok, Qwen), as well as a small number of model-specific preferences. We make two contributions: (i) we reproduce or improve on the pass@1 performance reported by diverse model-provider families on popular agentic benchmarks (SWE-Pro, SWE-Verified and Terminal-Bench-2), and (ii) building on an analysis of 138k trajectories generated by SSA, we look beyond the pass@1 numbers, which tend to be relatively even across frontier models. By representing agent trajectories in code state-spaces, we observe model-level differences in problem-solving behavior. Finer-grained metrics such as edit frequency, testing activity, and phase-transitions reveal how individual models allocate effort across different stages of autonomous problem solving.
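
Metrics of the kind the abstract names (edit frequency, testing activity, phase transitions) can be computed from a labeled step sequence. The sketch below is a hedged illustration: the phase labels, the example trajectory, and the function names are hypothetical, and the paper's actual code state-space representation is not reproduced here.

```python
from collections import Counter

# Hypothetical phase labels for each step of one agent trajectory.
trajectory = ["explore", "explore", "edit", "test", "edit", "edit", "test", "submit"]

def phase_shares(steps):
    """Fraction of steps spent in each phase."""
    counts = Counter(steps)
    return {phase: n / len(steps) for phase, n in counts.items()}

def phase_transitions(steps):
    """Count how often the agent moves from one phase to a different one."""
    return Counter((a, b) for a, b in zip(steps, steps[1:]) if a != b)

shares = phase_shares(trajectory)
transitions = phase_transitions(trajectory)
print(shares["edit"], shares["test"])
print(transitions[("edit", "test")], sum(transitions.values()))
```

Two models with the same pass@1 could still show different shares and transition counts, which is the kind of difference such metrics are meant to surface.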

2606.15345 2026-06-18 cs.CL cs.IR New submission 80%

Beyond Monolingual Deep Research: Evaluating Agents and Retrievers with Cross-Lingual BrowseComp-Plus

Yuheng Lu, Qingcheng Zeng, Heli Qi, Puxuan Yu, Fuheng Zhao, Rui Yang, Hitomi Yanaka, Naoto Yokoya, Weihao Xuan

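Affiliations: Waseda University; Northwestern University; RIKEN AIP; Snowflake Inc.; University of Utah; Duke-NUS Medical School; The University of Tokyo

Topic match (Other Agent): evaluating the cross-lingual ability of deep research agents

AI summary: Proposes XBCP, a cross-lingual benchmark that evaluates deep research agents when the evidence language differs from the query language, and finds substantial degradation on both the retrieval side and the agent side.

Comments: Preprint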

Abstract

Deep research agents are increasingly evaluated on their ability to search for evidence, reason over retrieved sources, and produce grounded answers. Existing browsing benchmarks, however, largely assume that the user's query and the supporting evidence are written in the same language, leaving open whether agentic search systems can operate when relevant evidence appears in another language. We introduce XBCP (Cross-lingual BrowseComp-Plus), a controlled benchmark that preserves the English question-and-answer space of BrowseComp-Plus but varies the languages of the supporting documents. XBCP instantiates two complementary settings: in the cross-lingual setting, each query is paired with evidence in a single assigned language. In the multilingual setting, the full evidence corpus is distributed equally and randomly across 12 languages spanning high-resource and low-resource regimes. We evaluate four deep research agents using sparse and dense multilingual retrievers, measuring answer accuracy, evidence recall, search behavior, calibration, citation fidelity, and oracle retrieval. Results reveal substantial degradation when evidence is translated. Even strong, dense retrievers lose evidence recall, and agents become less calibrated and cite evidence less reliably. Notably, accuracy remains lower even when all gold evidence is supplied directly. These findings suggest that cross-lingual deep research exposes both retrieval failures and an independent, agent-side difficulty in integrating language-mismatched evidence.
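
Two pieces of the setup lend themselves to a short sketch: spreading a corpus equally and randomly across languages, and measuring evidence recall. The code below is one plausible reading rather than the benchmark's implementation; the shuffle-then-deal scheme, the three placeholder language codes, and the function names are assumptions.

```python
import random

def assign_languages(doc_ids, languages, seed=0):
    """Shuffle documents, then deal them round-robin so language counts differ by at most one."""
    shuffled = list(doc_ids)
    random.Random(seed).shuffle(shuffled)
    return {doc: languages[i % len(languages)] for i, doc in enumerate(shuffled)}

def evidence_recall(retrieved, gold):
    """Share of gold evidence documents that appear among the retrieved ones."""
    gold = set(gold)
    return len(gold & set(retrieved)) / len(gold) if gold else 0.0

docs = [f"doc{i}" for i in range(24)]
langs = ["en", "ja", "sw"]          # placeholder subset; the benchmark uses 12 languages
assignment = assign_languages(docs, langs)
per_language = {lang: sum(v == lang for v in assignment.values()) for lang in langs}
print(per_language)
print(evidence_recall(["doc1", "doc7", "doc9"], ["doc1", "doc2", "doc9", "doc11"]))
```

Recall computed this way depends only on which documents were retrieved, so it can be compared across languages without reference to the final answer.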

2606.19079 2026-06-18 cs.AI New submission 75%

ARIADNE: Agnostic Routing for Inference-time Adapter DyNamic sElection

Enrico Cassano, Michał Brzozowski, Zuzanna Dubanowska, Paolo Mandica, Neo Christopher Chung

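Affiliations: University of Turin; Samsung AI Center

Topic match (Other Agent): dynamic adapter selection at inference time; routing framework

AI summary: Proposes ARIADNE, a training-free, adapter-agnostic routing framework that represents each adapter by centroids of its training-set embeddings and selects an adapter at inference time by distance in latent space. It needs no access to adapter internals and no additional training, and reaches 89.7% selection accuracy on 44 tasks.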

Abstract

The increasing deployment of parameter-efficient fine-tuning (PEFT) has led to model ecosystems in which a single backbone is paired with many task-specialized adapters. In this setting, inference-time queries often arrive without task labels, requiring the system to automatically select the most appropriate adapter from a growing and heterogeneous adapter pool. Existing routing methods either depend on access to adapter internals, such as weight decompositions or gradient-based statistics, or require additional router training, which limits scalability and portability as new adapters are added. We introduce ARIADNE, a training-free, adapter-agnostic routing framework for dynamic adapter selection at inference time. ARIADNE represents each adapter through a set of centroids computed from embeddings of its training set, capturing the data distribution associated with that adapter. Given an unlabeled input, it selects an adapter by measuring proximity to these centroids in latent space. Because routing is performed entirely in the input embedding space, ARIADNE is compatible with arbitrary PEFT methods and requires no modification to the adapters or training procedures. Primarily evaluated with Llama 3.2 1B Instruct on 23 diverse NLP tasks, ARIADNE recovers 97.44% of the upper bound performance. Scaling to 44 tasks, it achieves 89.7% average selection accuracy, without additional training or access to adapter internals.
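
Centroid-based routing as described can be sketched in a few lines of standard-library Python. The sketch makes several assumptions the abstract does not confirm: training embeddings arrive already grouped into clusters, proximity is Euclidean distance, and the adapter names and 2-D vectors are toy values.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def build_index(adapter_groups):
    """Map each adapter name to the centroids of its pre-grouped training embeddings."""
    return {name: [centroid(g) for g in groups] for name, groups in adapter_groups.items()}

def route(index, query):
    """Pick the adapter whose nearest centroid is closest to the query embedding."""
    return min(index, key=lambda name: min(math.dist(query, c) for c in index[name]))

# Toy 2-D embeddings; a real system would embed text with an encoder.
groups = {
    "summarization": [[[0.9, 0.1], [1.1, -0.1]], [[2.0, 0.0], [2.2, 0.2]]],
    "translation":   [[[-1.0, 1.0], [-1.2, 0.8]]],
}
index = build_index(groups)
print(route(index, [1.0, 0.2]), route(index, [-0.9, 0.7]))
```

Adding an adapter only requires computing its centroids and inserting them into the index, which matches the training-free property the abstract emphasizes.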

2606.18259 2026-06-18 cs.HC cs.AI New submission 75%

Caring Without Feeling: Affective Dynamics as the Control Layer of Human-AI Agent Collaboration

Junjie Xu, Xingjiao Wu, Zihao Zhang, Yujia Xu, Yuzhe Yang, Jin Zhu, Luwei Xiao, Wen Wu, Liang He

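Affiliations: East China Normal University; National University of Singapore

Topic match (Other Agent): a review of the control role of affective dynamics in human-AI agent collaboration

AI summary: Reviews the role of affective dynamics in human-AI agent collaboration and proposes treating affect as a coordination layer rather than an internal property of AI, used to calibrate trust, delegation, and governance.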

Abstract

AI agents that plan, retain memory across sessions, invoke external tools and act with partial autonomy are transforming human-AI collaboration. Research on affective computing, simulated empathy in large language models, trust in automation and AI safety has illuminated important design principles, yet these literatures remain fragmented. No integrated account explains how affective cues operate within agentic collaboration, that is, settings in which humans delegate, monitor and correct consequential tasks. This Review synthesises computational and interactional mechanisms of affective dynamics: the processes through which affective cues, emotion-like behaviour and perceived agent affect shape trust calibration, delegation decisions, error correction, dependence and governance. We trace how model-generated affective signals enter interaction loops that govern reliance, repair and oversight, and propose a framework that treats affect not as an internal property of AI but as a coordination layer through which humans and agents negotiate capability, uncertainty and responsibility. The framework provides a foundation for calibrated measurement, purposeful design and informed governance.

2606.18406 2026-06-18 cs.CL New submission 70%

CoreMem: Riemannian Retrieval and Fisher-Guided Distillation for Long-Term Memory in Dialogue Agents

Jiaqi Chen, Yongqin Zeng, Shaoshen Chen, Yijian Zhang, Hai-Tao Zheng, Chunxia Ma, XiuTeng Zhou

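Affiliations: Shenzhen International Graduate School, Tsinghua University; Peng Cheng Laboratory; Shandong Analysis and Test Center, Qilu University of Technology; State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-di Herbs

Topic match (Other Agent): long-term memory architecture for dialogue agents

AI summary: Proposes the CoreMem architecture, which replaces cosine similarity with Riemannian retrieval to address the hubness problem in high-dimensional retrieval and uses Fisher-guided discrete token distillation for principled compression, enabling long-term-memory dialogue agents on edge devices with 8 GB of VRAM.

Comments: 15 pages, 5 figures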

Abstract

Personalized dialogue agents require continuous long-term memory to maintain coherent interactions across multiple sessions. However, deploying these capabilities on consumer-grade hardware (e.g., 8 GB VRAM edge devices) introduces severe memory and compute bottlenecks. Existing systems typically rely on isotropic cosine similarity for retrieval and heuristic rules for context compression. These approaches lack a unified theoretical foundation, frequently suffering from the hubness problem in high-dimensional retrieval and syntactic fragmentation during compression. To overcome these limitations, we propose CoreMem, a resource-efficient edge-cloud memory architecture fundamentally unified by information geometry. First, Riemannian retrieval replaces cosine matching with a locally adaptive Fisher-Rao metric, effectively penalizing hub memories via Mahalanobis distance with O(Ndr) Woodbury acceleration for real-time search. Second, Fisher-guided discrete token distillation (FDTD) introduces a hierarchical sentence-to-token compression mechanism. It derives sensitivity scores from Fisher information traces, providing a principled compression-KL tradeoff augmented with explicit structural syntax protection. Evaluated on the LOCOMO and LongMemEval-S benchmarks, CoreMem achieves strong accuracy improvements, yielding substantial gains in Open-domain (+4.51 pp) and Temporal (+4.17 pp) reasoning. Extensive profiling confirms that CoreMem operates seamlessly within a strict 8 GB VRAM budget, successfully bridging the gap between resource-constrained edge devices and the demand for theoretically grounded, lifelong memory agents.
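
The O(Ndr) cost quoted for the Mahalanobis search is consistent with applying the Woodbury identity to a low-rank metric. The derivation below is a sketch under an assumption the abstract does not state: that the metric's covariance has the form of a scaled identity plus a rank-r term, with a single shared U rather than a locally varying one. The symbols λ, U, δ_i, and z_i are introduced here for illustration.

```latex
% Assumption: \Sigma = \lambda I_d + U U^\top with U \in \mathbb{R}^{d \times r}, r \ll d.
\begin{align}
\Sigma^{-1} &= \frac{1}{\lambda} I_d
  - \frac{1}{\lambda^2}\, U \Bigl(I_r + \frac{1}{\lambda} U^\top U\Bigr)^{-1} U^\top, \\
d_M^2(q, m_i) &= (q - m_i)^\top \Sigma^{-1} (q - m_i)
  = \frac{1}{\lambda} \lVert \delta_i \rVert^2
  - \frac{1}{\lambda^2}\, z_i^\top \Bigl(I_r + \frac{1}{\lambda} U^\top U\Bigr)^{-1} z_i,
\end{align}
% where \delta_i = q - m_i and z_i = U^\top \delta_i.
```

The r-by-r inverse is computed once. Each of the N memories then costs O(dr) to form z_i plus O(r^2) for the quadratic form, giving O(Ndr) overall when r is at most d, instead of the O(Nd^2) of a dense inverse.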

2505.03863 2026-06-18 cs.CR cs.AI 55%

Data-Driven Falsification of Cyber-Physical Systems

Atanu Kundu, Sauvik Gon, Rajarshi Ray

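Affiliations: Indian Association for the Cultivation of Science

Topic match (Other Agent): data-driven falsification of cyber-physical systems, involving agent verification

AI summary: Proposes a framework that connects the falsification of cyber-physical systems with the falsification of deep neural networks and uses the interpretability of decision trees to speed up falsification, showing promise in efficiently finding multiple counterexamples on the ARCH-COMP 2024 benchmarks.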

Abstract

Cyber-Physical Systems (CPS) are abundant in safety-critical domains such as healthcare, avionics, and autonomous vehicles. Formal verification of their operational safety is, therefore, of utmost importance. In this paper, we address the falsification problem, where the focus is on searching for an unsafe execution in the system instead of proving the absence of such executions. The contribution of this paper is a framework that (a) connects the falsification of CPS with the falsification of deep neural networks (DNNs) and (b) leverages the inherent interpretability of Decision Trees for faster falsification of CPS. This is achieved by: (1) building a surrogate model of the CPS under test, either as a DNN model or a Decision Tree, (2) applying various DNN falsification tools to falsify the CPS, and (3) a novel falsification algorithm guided by the explanations of safety violations of the CPS model extracted from its Decision Tree surrogate. The proposed framework has the potential to exploit a repertoire of adversarial attack algorithms designed to falsify robustness properties of DNNs, as well as state-of-the-art falsification algorithms for DNNs. Although the presented methodology is applicable to systems that can be executed/simulated in general, we demonstrate its effectiveness, particularly in CPS. We show that our framework, implemented as a tool FlexiFal, can detect hard-to-find counterexamples in CPS that have linear and non-linear dynamics. Decision tree-guided falsification shows promising results in efficiently finding multiple counterexamples in the ARCH-COMP 2024 falsification benchmarks [khandait2024arch].
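
The falsification problem itself, searching for one unsafe execution rather than proving safety, can be illustrated with a toy simulator. The sketch below uses plain random search, which is not the surrogate-guided or decision-tree-guided algorithm the paper proposes; the dynamics, the safety limit, and the names `simulate` and `falsify` are all invented for illustration.

```python
import random

def simulate(x0, u, steps=20, dt=0.1):
    """Toy damped system under constant input u; returns the peak position reached."""
    x, v, peak = x0, 0.0, x0
    for _ in range(steps):
        v += dt * (u - 0.5 * v)
        x += dt * v
        peak = max(peak, x)
    return peak

def falsify(limit, budget=500, seed=0):
    """Search for an initial state and input that violate the safety property peak < limit."""
    rng = random.Random(seed)
    for _ in range(budget):
        x0, u = rng.uniform(0.0, 1.0), rng.uniform(-1.0, 1.0)
        if simulate(x0, u) >= limit:
            return x0, u          # counterexample: an unsafe execution
    return None                   # nothing found; this does not prove safety

counterexample = falsify(limit=2.0)
print(counterexample)
```

A guided method would aim to spend the same simulation budget closer to the unsafe region; random search only serves here as the simplest baseline.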