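arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.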

2606.10938 2026-06-10 cs.LG new submission

A Systematic Approach for Selecting Trajectories for Data Augmentation

Adam Nordling

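AI summary: Proposes a systematic framework for evaluating five trajectory selection strategies (outlierness, diversity, representativeness, uncertainty, and random), tested on four datasets; the outlierness and uncertainty strategies improve stability on sparse data but can introduce noise on dense data.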

Comments: 39 pages, 4 figures, Masters project

Abstract

Trajectory data augmentation is a promising approach to mitigate data scarcity in machine learning applications, but its utility has been limited by the complexity of preserving spatio-temporal coherence. Although prior work demonstrated the viability of geometric perturbation, it relied on naive random selection, leaving a critical gap in understanding which trajectories should be augmented for maximal benefit. This thesis addresses this gap by developing a systematic and scalable framework to evaluate five systematic selection strategies: Outlierness, Diversity, Representativeness, Uncertainty, and Random selection. These strategies were rigorously tested across four datasets covering animal behavior (Foxes and Starkey), maritime traffic (AIS), and urban traffic (Car) using a suite of linear and non-linear machine learning models. As part of this evaluation, an Optuna-based hyperparameter optimization loop was integrated to empirically identify the best-performing augmentation parameters for each dataset within the explored search space. The results indicate that, while systematic selection is not a universal solution, it offers distinct advantages over the random baseline. Systematic strategies, particularly Outlierness and Uncertainty, demonstrated higher stability and were less prone to performance degradation observed with random sampling in dense datasets. However, the findings also reveal that the value of augmentation is strictly conditional. Visual analysis via UMAP demonstrates that while systematic augmentation successfully repairs topological fragmentation in sparse datasets, it can act as a corrupting noise signal in high-quality, dense datasets. Furthermore, the study identified physical limitations in high-velocity domains, where standard perturbation techniques lead to divergence in feature space...

2606.10937 2026-06-10 cs.DB cs.AI new submission

Provenance Tracking in AI Compilers through the Lens of Coalgebra

Zilu Tian, Liying Liu

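AI summary: Graph rewriting in AI compilers makes provenance hard to track; the paper proposes a lightweight approach based on observational semantics, formalized with coalgebra and bisimulation, and validates it in the prototype compiler COVAN.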

Abstract

AI compilers aggressively rewrite computation graphs through normalization, lowering, and optimization, making it difficult to track the provenance of tensors and operators across compilation. Reliable provenance is essential for attaching platform-specific postprocessing, debugging compiler behavior, and validating transformations, yet existing solutions are either invasive or ad hoc under non-injective graph rewrites. We present a lightweight, generative approach to provenance tracking based on observational semantics. Instead of propagating identifiers through compiler passes, we observe graph transformations and reason about provenance in terms of observable computational actions. We formalize this approach using a coalgebraic model and bisimulation, which preserves provenance even when intermediate nodes are eliminated. Furthermore, we implement this approach in a prototype AI compiler COVAN, demonstrating stable provenance across compilation pipelines with minimal engineering overhead.

2606.10934 2026-06-10 cs.AI new submission

WorldKernel: A World Model is the Coupling Kernel of Admissible Possible Worlds

Fabio Rovai

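AI summary: Finds that strong predictors fail on counterfactual couplings and proposes modeling a world model as a positive semidefinite coupling kernel over possible worlds, whose off-diagonal entries encode counterfactual information; positive semidefiniteness constraints and logical axioms enable efficient inference.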

Abstract

A common assumption holds that enough observational and interventional data, given to a strong enough predictor, suffices. We report a failure mode that contradicts it. Across hundreds of structural causal models, on identified quantities a strong predictor and a Bayesian baseline both succeed, but on unidentified quantities (the couplings between counterfactual worlds) the predictor collapses to a point, on 28% of models to one no valid model can produce, while the truth is an admissible interval more data never narrows. The gap is structural: prediction cannot represent uncertainty over counterfactual couplings. We cast a world model as a single positive semidefinite coupling kernel K(T,T') over admissible worlds, whose diagonal is the ordinary posterior (what a predictor recovers) and whose off-diagonal is the cross-world coupling it cannot, which every counterfactual reads. The paper is the theory of that off-diagonal. It is real: two states with identical posteriors differ on a cross-world query, and the off-diagonal is the coupling that fixes counterfactuals. It can be bounded: positive semidefiniteness is partial-identifying information the marginals lack, and enforcing it bounds counterfactuals in polynomial time where the exact response-type program is intractable. Logical structure sharpens it: ontology axioms tighten the bound by up to a third, propagating to couplings they never touch. It can be acquired: targeted scars, constraints learned from encountered infeasibilities, close the gap several times faster than untargeted ones. Its full reconstruction is approximate counting of the admissible worlds, tractable below the Sly-Sun threshold and inapproximable above; we do not claim to beat the worst case.
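As a toy illustration of why positive semidefiniteness carries bounding information, consider just two worlds. A symmetric 2x2 matrix is positive semidefinite exactly when its diagonal entries and its determinant are non-negative, so the diagonal alone confines the off-diagonal coupling to an interval. The sketch below is not the paper's program; the function names and the numbers are made up for illustration.

```python
import math


def offdiag_interval(k_tt: float, k_uu: float) -> tuple[float, float]:
    """Range of K(T, T') allowed by PSD, given the two diagonal entries."""
    if k_tt < 0 or k_uu < 0:
        raise ValueError("a PSD matrix has a non-negative diagonal")
    bound = math.sqrt(k_tt * k_uu)  # from det >= 0: K(T,T')^2 <= K(T,T) * K(T',T')
    return (-bound, bound)


def is_psd_2x2(k_tt: float, k_tu: float, k_uu: float, tol: float = 1e-12) -> bool:
    """A symmetric 2x2 matrix is PSD iff its diagonal and determinant are >= 0."""
    return k_tt >= -tol and k_uu >= -tol and k_tt * k_uu - k_tu ** 2 >= -tol


# Hypothetical diagonal values (illustrative only).
low, high = offdiag_interval(0.4, 0.9)
print(low, high)
print(is_psd_2x2(0.4, 0.7, 0.9))  # a coupling outside the interval
```

With more than two worlds the constraint becomes a joint condition on the whole matrix, which is where a general solver would be needed instead of a closed form.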

2606.10902 2026-06-10 cs.CV cs.AI new submission

Pose-ICL: 3D-Aware In-Context Learning for Pose-Controllable Subject Customization

Xuan Han, Yihao Zhao, Mingyu You

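AI summary: Proposes Pose-ICL, a tuning-free framework for pose-controllable subject customization built on 3D-aware in-context learning and Surface-Anchored Position Embedding (SAPE), significantly improving pose accuracy and identity consistency.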

Abstract

Subject Customization is a foundational task in modern image generation. By providing a few reference images and a text prompt, users can generate images of a specific object in any desired scene. However, existing methods still struggle to achieve effective pose control for customized subjects. In practice, they often exhibit inaccurate poses or inconsistent cross-pose appearances. These limitations suggest that understanding objects in a volumetric manner remains a significant challenge for 2D-native backbones. To address this challenge, we propose Pose-ICL, a tuning-free framework that leverages 3D-aware In-Context Learning (ICL) to directly adapt to new subjects through multiple paired image-pose references. Its core mechanism, Surface-Anchored Position Embedding (SAPE), equips the model with explicit 3D awareness by anchoring image tokens to the surface coordinates of a volumetric bounding box. Dedicated refinements ensure its seamless compatibility with existing DiT models. Extensive evaluations on both 3D assets and real-world subjects demonstrate that Pose-ICL significantly outperforms current methods in both pose accuracy and identity consistency.

2606.10894 2026-06-10 cs.CV new submission

The 1st PortraitCraft Challenge: A CVPR 2026 Workshop Competition on Portrait Composition Understanding and Generation

Zijie Lou, Youyun Tang, Xiaochao Qu, Haoxiang Li, Ting Liu, Luoqi Liu, Xun Zhu, Zheng Zhang, Xi Chen, Miao Li, Ji Wu, Dizhe Zhang, Xian Ge, Sujia Wang, Ruiyang Zhang, Jiaming Wang, Xianshun Wang, Lu Qi, Boao Kang, Wei Zhou, Jinghui Sun, Zhenyu Yan, Jiliang Zhao, Rui Yang, Yipo Huang, Boyuan Liu, Shanglin Li, Zifan Xie, Yichen Zhang, Anlan Wang, Wenfeng Lin, Mingyu Guo, Dong Li, Xinghao Wang, Yanting Li, Shanzhao Tong, Shuai He, Qiu Zhou, Yongqi Yang, Taoyang Mu, Dianqiao Lei, Anlong Ming, Huadong Ma

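AI summary: Presents the PortraitCraft Challenge, with two tracks on composition understanding and generation, and releases a dataset of about 50,000 portraits to advance research on portrait aesthetics and controllable image generation.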

Abstract

This paper presents an overview of the inaugural PortraitCraft Challenge, held as one of the official competitions at CVPR 2026. The challenge focuses on portrait composition understanding and generation, aiming to advance AI research in portrait aesthetics analysis and controllable image synthesis. Unlike existing datasets and tasks that primarily focus on global aesthetic scoring, PortraitCraft introduces a unified evaluation framework comprising two complementary tracks. Track 1 requires models to perform structured portrait composition understanding, and Track 2 requires models to generate portrait images from structured composition descriptions under explicit compositional constraints. To support the challenge, we constructed and publicly released a large-scale portrait composition dataset consisting of approximately 50,000 curated real portrait images, providing multi-level supervision. This report describes the challenge setup, evaluation protocols, dataset composition, and final results, along with an analysis of the technical characteristics of the submitted solutions. The PortraitCraft Challenge provides a standardized and reproducible platform for research on portrait composition understanding and generation, and is expected to foster further progress in the fields of portrait aesthetics and controllable image generation.

2606.10892 2026-06-10 cs.CV cs.AI new submission

Improving Text-Instance Alignment Of Foreground Conditioned Out-Painting Via Customized Concept Embedding

Yihao Zhao, Xuan Han, Bin He, Mingyu You

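AI summary: To address artifacts produced by text-driven methods in foreground conditioned outpainting, proposes a customized concept embedding diffusion framework that customizes concept embeddings via an instance-aware loss and a semantic-preserving prompt template, significantly reducing artifacts and improving image quality.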

Abstract

To showcase products, merchants often incur substantial costs creating high-quality display images. Foreground Conditioned Outpainting (FCO) meets this demand, allowing users to create desired backgrounds for foreground instances at a low cost by adjusting the text prompt. However, existing text-driven FCO methods exhibit critical flaws in their outputs, most notably the presence of artifacts, which refer to regions in the synthesized background that share the same semantics as the foreground instance. Such artifacts diminish the object's prominence and degrade image quality. We attribute the issue to the misalignment between the given instance and text-derived concept embeddings. To address this, we propose the Customized Concept Embedding Diffusion (CCE-Diffusion) framework. Its core is a CCE-Module to customize concept embeddings, bridging the gap between generic noun semantics and a specific visual instance. An Instance-Aware Loss guides the module's optimization, while a Semantic-Preserving Prompt Template prevents customized embeddings from distorting other words in the prompt. Both qualitative and quantitative evaluations demonstrate that CCE-Diffusion significantly reduces artifacts in the outputs. As a plug-and-play component, the CCE-Module can integrate with various FCO methods, enhancing their performance.

2606.10887 2026-06-10 cs.CV new submission

Listen, Look, and Learn: Learning Without Forgetting through SAM-Audio

Avi Gupta, Nilotpal Sinha, Vishnu Raj, Sambuddha Saha, Pratik Joshi, Koteswar Rao Jerripothula, Tammam Tillo

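AI summary: Proposes a class-incremental learning method that uses SAM-Audio's multimodal priors, with a guided attention mechanism and a dual-level distillation strategy, to mitigate catastrophic forgetting in audio-visual settings, outperforming existing methods.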

Abstract

Class-Incremental Learning (CIL) aims to continuously learn new classes without forgetting previously acquired knowledge. While recent CIL advances have spurred significant interest across various modalities, the audio-visual setting remains underexplored. Furthermore, although foundational multimodal models like SAM-Audio encapsulate rich static priors, our empirical analysis reveals that these representations struggle in incremental settings. This work bridges this gap by integrating SAM-Audio's audio-visual priors into the CIL setting. Specifically, we leverage its dense audio and visual representations and employ a novel guided attention strategy where the audio features contextually guide the visual representations. To further mitigate catastrophic forgetting, we introduce dual-level distillation objectives at both the feature and logit levels. Extensive evaluations on audio-visual CIL benchmarks demonstrate that our approach consistently outperforms state-of-the-art methods.

2606.10881 2026-06-10 cs.AI new submission

Large-scale semantic mapping of learner agency and autonomy reveals what measurement and generative AI research overlook

Fei Qin, Xiaobo Liu, Yaowen Zhang, Xuming Li, Fei Wang, Mutlu Cukurova, Jingjing Chen, Yu Zhang

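AI summary: Using a semantic analysis pipeline to extract definitions and scale items from over 14,000 publications, finds that learner agency and autonomy span three dimensions (task, person, and sociocultural); existing scales neglect the sociocultural dimension, and generative AI research focuses too narrowly on learning regulation and control.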

Comments: 45 pages, 12 figures, 1 table, including appendices

Abstract

Learner agency and autonomy are foundational to personal development, yet a pervasive "jingle-jangle" fallacy (i.e. identical terms denoting different constructs, distinct terms denoting identical ones) has substantially hindered cumulative knowledge. Treating meaning as a phenomenon constituted through use in linguistic practice, we extracted 8,954 definitions and 2,700 scale items from over 14,000 publications, to investigate how researchers actually used learner agency and autonomy with a semantic analysis pipeline. The definitional landscape of two constructs resolves into three dimensions: regulation and control of learning (task), intrinsic motivation and internal decision-making (person), and social-relational action (sociocultural), thereby empirically quantifying the jingle-jangle fallacy. Existing scales, however, systematically underrepresent the sociocultural dimension. Critically, current generative AI research in education concentrates on learning regulation and control, narrowing the behavioral repertoire that AI-mediated learning environments are designed to cultivate. Beyond conceptual clarification, this work carries direct implications for conceptualization, measurement, and practice towards supporting the multidimensional learner agency and autonomy.

2606.10876 2026-06-10 cs.CV new submission

Advancing Wood Identification in the Philippines: Utilizing the Xylorix Platform for Efficient AI Model Development and Deployment for Five Key Species

Rosalie C. Mendoza, Vivian C. Daracan, Arlene D. Romano, Ronniel D. Manalo, Xin Jie Tang, Yi Hong Wong, Yong Haur Tay

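AI summary: Using the Xylorix platform, wood scientists with no programming experience developed and deployed macroscopic wood identification AI models for five Philippine hardwoods, reaching AUCs of 0.969 to 1.000, with four species graded AA, showing that non-programmers can build reliable models suitable for field deployment.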

Abstract

Illegal logging and timber trade continue to pose significant challenges in the Philippines, where accurate wood species identification is essential for enforcement but limited by the need for specialised equipment and expertise. This study aims to evaluate whether AI models for macroscopic wood identification can be developed and deployed by wood scientists without programming expertise using the Xylorix platform, focusing on five Philippine hardwood species: Mangium (Acacia mangium Willd.), Rain Tree [Samanea saman (Jacq.) Merr.], Banuyo (Wallaceodendron celebicum Koord.), Tindalo [Afzelia rhomboidea (Blanco) Vidal], and Ipil [Intsia bijuga (Colebr.) O. Kuntze]. Binary classifiers were trained on 10,663 verified cross-section images from 260 specimens and evaluated using specimen-level mean scoring to mirror operational field conditions. Area Under the ROC Curve (AUC) values ranged from 0.969 (Ipil) to 1.000 (Mangium), and Average Precision (AP) values ranged from 0.589 (Rain Tree) to 1.000 (Mangium). Four of five species achieved AA grade (AUC and AP both ≥ 0.90); Rain Tree received AE (AUC ≥ 0.90, AP < 0.60) due to AP compression from its small positive test set (3 specimens). All five classifiers rank their target specimens above non-target specimens with near-perfect fidelity. Specimen-level error analysis revealed 9 false negatives for Ipil, primarily stemming from localized image artifacts, along with 3 false positives for Rain Tree and 1 false positive for Tindalo caused by shared tribal-level anatomical traits. These findings demonstrate that non-programmers can leverage the Xylorix platform to construct operationally reliable wood identification models suitable for field deployment at supply chain checkpoints.
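The specimen-level mean scoring mentioned in the abstract can be sketched as follows: image scores are averaged per specimen, and AUC is then the fraction of (target, non-target) specimen pairs that the classifier ranks correctly. This is a minimal sketch under assumptions; the specimen IDs, scores, and function names are hypothetical, and the abstract does not specify how the Xylorix platform computes these values.

```python
from collections import defaultdict


def specimen_mean_scores(image_scores):
    """Average per-image classifier scores into one score per specimen.

    image_scores: iterable of (specimen_id, score) pairs.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for specimen_id, score in image_scores:
        sums[specimen_id] += score
        counts[specimen_id] += 1
    return {sid: sums[sid] / counts[sid] for sid in sums}


def auc(scores, labels):
    """Probability that a random target specimen outranks a random
    non-target specimen (ties count as half a win)."""
    pos = [scores[s] for s in scores if labels[s] == 1]
    neg = [scores[s] for s in scores if labels[s] == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: two target specimens (A, B), two non-target (C, D).
images = [("A", 0.9), ("A", 0.7), ("B", 0.4), ("B", 0.6),
          ("C", 0.3), ("C", 0.1), ("D", 0.6), ("D", 0.6)]
labels = {"A": 1, "B": 1, "C": 0, "D": 0}

scores = specimen_mean_scores(images)
print(auc(scores, labels))
```

Averaging before ranking means a single noisy image cannot flip a specimen's decision on its own, which is presumably why a specimen-level protocol is closer to field use than image-level scoring.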

2606.10861 2026-06-10 cs.SE cs.AI cs.HC new submission

From Perception to Action: Can UI Interventions Foster Sustainable LLM Chatbot

Nitish Patkar, Pooja Rani, Jack Glässer, Simon Lüscher, Martin Kropp

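AI summary: Studies whether UI interventions (such as mode switching and energy feedback) raise users' awareness of the energy use of LLM chatbots and encourage energy-saving behavior, finding that mode switching is the primary behavioral mechanism.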

Abstract

LLM-powered chatbots are increasingly embedded in everyday workflows, raising sustainability concerns due to their energy use. Most mitigation strategies emphasize model or infrastructure efficiency, while the user-interface (UI) layer remains underexplored despite its potential to shape interaction behavior. We investigate whether sustainability-oriented UI interventions can increase users' energy awareness and encourage more energy-responsible chatbot use without reducing usability. We first conducted a baseline survey with 77 participants to assess awareness and receptiveness to intervention concepts. Guided by prior work on persuasive technology and choice architecture, we implemented a web-based chatbot prototype with a three-mode switch (Energy-efficient, Balanced, Performance), per-response energy feedback, pre-send energy estimates, a usage metrics dashboard, and energy analogies. We then evaluated the prototype in a five-day field study with 11 participants. In the baseline survey, 94.8% of respondents reported at least some awareness of AI energy use, yet 88.3% misestimated actual consumption. Although concern about environmental impact was high, only 39.0% indicated willingness to accept a performance trade-off for lower energy use. In the field study, Energy-efficient mode accounted for 55.8% of logged prompts, while 90.9% self-reported actively choosing Eco-mode when high accuracy was not required. Participants did not reduce prompt length, suggesting mode switching as the primary behavioral mechanism. Sustainability-oriented UI interventions can improve awareness and support more energy-responsible interaction patterns in LLM chatbots. These effects are best interpreted as behavioral and model-based estimates that complement backend efficiency work, and the provided prototype and replication package support further research on energy-aware conversational AI design.

2606.10860 2026-06-10 cs.CR cs.CL new submission

Training LLMs to Enforce Multi-Level Instruction Hierarchies via Gravity-Weighted Direct Preference Optimization

Lena S. Bolliger, Lena A. Jäger

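AI summary: Proposes Gravity-Weighted DPO (GW-DPO), which weights the structural distance between conflicting levels under a linear or bilateral schedule; combined with hierarchy delimiter tokens and Instructional Segment Embeddings, it improves multi-level instruction priority adherence and lowers over-refusal on Llama-3.1-8B-Instruct.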

Abstract

Production LLMs receive instructions from sources with very different levels of trust, yet attend to every token with uniform architectural privilege. This is the structural vulnerability that enables malicious prompt injections and, more broadly, leaves models without a principled way to resolve conflicts between legitimate but competing instructions. A common training-based response is to teach models an explicit instruction hierarchy; existing approaches, however, formalize hierarchies of only three or four levels, treat all violations as equally severe, and rarely evaluate the full set of pairwise level interactions. We formalize a k-level instruction hierarchy problem and instantiate it for k=5, yielding ten pairwise priority relations that a compliant model must enforce. We then introduce Gravity-Weighted DPO (GW-DPO), a preference-optimization objective whose per-sample offset scales with the structural distance between conflicting levels under a linear or bilateral schedule, the latter weighting severity by both the privilege gap and the privilege of the victim level. Combined with hierarchy-specific delimiter tokens (Chen et al., 2025) and Instructional Segment Embeddings (ISE; Wu et al., 2025), GW-DPO with the bilateral schedule Pareto-improves over standard DPO and the linear variant on Llama-3.1-8B-Instruct, raising macro pairwise priority adherence while keeping over-refusal at half the standard DPO rate. Ablations isolate ISE as a refusal-threshold calibrator and recast five- versus three-level training as a generality-specialization tradeoff.
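Two points in the abstract lend themselves to a small worked example: why k=5 yields ten pairwise relations, and what a per-sample offset that grows with level distance might look like. The abstract does not give the GW-DPO formula, so everything below is an assumption: the offset is subtracted inside the sigmoid as in offset-style DPO variants, levels are numbered so that a higher number means more privilege, and both schedules (including the `alpha` and `beta` parameters) are hypothetical stand-ins rather than the authors' definitions.

```python
import math
from itertools import combinations

K = 5  # number of hierarchy levels, as instantiated in the abstract


def pairwise_relations(k):
    """All unordered pairs of distinct levels: k * (k - 1) / 2 of them."""
    return list(combinations(range(1, k + 1), 2))


def linear_offset(victim, attacker, k=K, alpha=1.0):
    """Hypothetical linear schedule: offset proportional to the privilege gap."""
    return alpha * (victim - attacker) / (k - 1)


def bilateral_offset(victim, attacker, k=K, alpha=1.0):
    """Hypothetical bilateral schedule: the gap term, additionally scaled
    by how privileged the overridden (victim) level is."""
    return linear_offset(victim, attacker, k, alpha) * victim / k


def offset_dpo_loss(margin, offset, beta=0.1):
    """-log(sigmoid(beta * margin - offset)), where margin is the usual DPO
    log-ratio difference between the chosen and rejected responses."""
    z = beta * margin - offset
    return math.log1p(math.exp(-z))


print(len(pairwise_relations(K)))
print(offset_dpo_loss(2.0, linear_offset(5, 1)))
print(offset_dpo_loss(2.0, bilateral_offset(3, 1)))
```

Under these assumptions, a larger offset demands a larger preference margin before the loss becomes small, so conflicts between distant levels are penalized more heavily than conflicts between adjacent ones.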

2606.10856 2026-06-10 cs.RO new submission

An Exposure-Time-Aligned Primary-Path Architecture for Autonomous-Driving ECUs

Toru Saito, Yuki Hagura, Tatsuya Konishi, Satoru Mizusawa, Takumi Yajima

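AI summary: For production vehicles transitioning from modular multi-NN pipelines to end-to-end autonomous driving, proposes three design principles (Primary-Path, Exposure-Time-Aligned, and Co-Path Coexistence), achieving a mean latency of 296 ms on a dual-SoC platform.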

Abstract

While end-to-end (E2E) autonomous driving has become the dominant research direction, production vehicles continue to rely on modular multi-NN pipelines for a non-trivial transitional period. The subject of this paper is the design of an architecture that, during this phase, supports a modular pipeline and an E2E path side by side and embeds a path for staged migration. Transplanted to a production SoC, egalitarian late fusion is compute-inefficient and offers no natural unit for staged E2E substitution. As an alternative, we propose three design principles: (i) Primary-Path, which explicitly selects a primary perception chain and prioritizes its enclosure within a single SoC pair over the non-critical paths; (ii) Exposure-Time-Aligned, which propagates the primary sensor's exposure time $\tau_{\rm exp}$ as a tag along the chain and event-drives the fusion node on matched $\tau_{\rm exp}$ rather than a fixed cycle; and (iii) Co-Path Coexistence, which, building on (i) and (ii), lets an E2E output path co-run with the modular pipeline within the same $\tau_{\rm exp}$ cycle. On a Dual-SoC production AD-ECU, the implementation closes camera-shutter to planner-output latency at a mean of 296 ms within the 350 ms design budget. Under (iii), the modular pipeline is primary at production launch and the E2E path runs as shadow on real vehicles, and the E2E scope is expanded as evaluation evidence accumulates.
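The Exposure-Time-Aligned principle, a fusion node that fires when all inputs carrying the same exposure-time tag have arrived rather than on a fixed cycle, can be sketched roughly as below. This is only an illustration of tag-matched, event-driven fusion; the class, the input names, and the integer tags are hypothetical and not taken from the paper's ECU implementation.

```python
from collections import defaultdict


class TagMatchedFusion:
    """Buffers inputs by exposure-time tag and fuses once a tag is complete."""

    def __init__(self, required_inputs, on_fuse):
        self.required = frozenset(required_inputs)
        self.on_fuse = on_fuse            # callback(tau_exp, {input_name: payload})
        self.pending = defaultdict(dict)  # tau_exp -> inputs received so far

    def push(self, name, tau_exp, payload):
        if name not in self.required:
            raise KeyError(f"unknown input: {name}")
        slot = self.pending[tau_exp]
        slot[name] = payload
        # Fire as soon as every required input has arrived for this tag.
        if slot.keys() >= self.required:
            del self.pending[tau_exp]
            self.on_fuse(tau_exp, slot)


# Hypothetical usage: two upstream stages tagged with the same exposure time.
fused = []
node = TagMatchedFusion({"camera_det", "lidar_det"},
                        lambda tau, inputs: fused.append((tau, sorted(inputs))))
node.push("camera_det", 1000, "c0")
node.push("camera_det", 1033, "c1")   # next frame arrives early; it waits
node.push("lidar_det", 1000, "l0")    # completes tag 1000, fusion fires
print(fused)
```

A production version would also need a policy for tags that never complete (for example, a timeout that drops stale entries), which this sketch omits.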

2606.10827 2026-06-10 cs.NI cs.AI new submission

A Unified Siamese Learning Framework for Zero-Day Anomaly Detection and Classification in Optical Networks

Carlos Natalino, Flávia P. Monteiro, Paolo Monti

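AI summary: Proposes a multi-similarity Siamese neural network that unifies zero-day anomaly detection and one-shot classification in optical networks, reaching over 99% accuracy across lightpaths and unseen anomaly types without retraining.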

2606.10811 2026-06-10 cs.CV new submission

Deep learning for echo sounder data

Ketil Malde

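AI summary: Discusses applying deep learning to acoustic data such as echograms, argues that the characteristics of acoustic data call for dedicated methods rather than simple reuse of image-processing models, and identifies the lack of standard datasets and formats as the main obstacle.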

2606.10798 2026-06-10 cs.LG new submission

CITRAS-FM: Tiny Time Series Foundation Model for Covariate-Informed Zero-Shot Forecasting

Yosuke Yamaguchi, Issei Suemitsu, Yuki Kajihara, Wenpeng Wei

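AI summary: Proposes CITRAS-FM, a time series foundation model with only 7M parameters that introduces Shifted Attention and the covariate synthesis method CovSynth for efficient zero-shot forecasting, achieving the best accuracy among sub-10M models on 100 tasks with CPU inference under 0.1 seconds.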

Comments: Accepted to EUSIPCO 2026

Abstract

Pretrained time series foundation models (TSFMs) have enabled zero-shot forecasting on unseen target series. However, existing TSFMs often incur high computational cost and provide limited support for diverse variable types, often failing to account for covariates that exogenously influence target variability. To address these challenges, we propose CITRAS-FM, a tiny 7M-parameter TSFM that supports univariate, multivariate, and covariate-informed zero-shot forecasting with real-time CPU inference. Built on a patch-based, decoder-only Transformer, CITRAS-FM introduces Shifted Attention into the cross-variate module to effectively exploit known covariates accessible throughout the forecast horizon. Moreover, to enable covariate-aware pretraining despite the scarcity of covariate-rich corpora, we propose CovSynth, which synthesizes realistic covariates from decomposed components of target series. Experiments on fev-bench, spanning 100 tasks across various settings, demonstrate that CITRAS-FM achieves state-of-the-art zero-shot accuracy among sub-10M TSFMs while delivering sub-0.1-second CPU inference, offering a strong balance between forecasting accuracy and real-time deployability.

2606.10789 2026-06-10 cs.LG new submission

Closing the Modality Gap in Zero-Shot HAR: Contrastive Training and Separability-Optimized Prototypes on IMU Data

Anik Ghosh

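AI summary: To address the modality gap in IMU-based zero-shot human activity recognition, combines contrastive training with descriptive prototypes, reaching 73.2% accuracy and 0.583 macro F1 on the PAMAP2 dataset, and argues that macro F1 is the more suitable evaluation metric.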

Comments: 17 pages, 7 figures

Abstract

Zero-shot learning (ZSL) for inertial measurement unit (IMU)-based human activity recognition (HAR) faces a central challenge: bridging the gap between sensor embeddings and semantic class representations. We systematically evaluate seven configurations combining three inference methods with two training pipelines on the PAMAP2 dataset, using 14 seen and 4 unseen activity classes with subjects 108 and 109 held out for testing. We find that the modality gap is a training-time phenomenon governed by the encoder objective. A temporal convolutional network (TCN) trained with cross-entropy over label-name Sentence-BERT prototypes yields sensor embeddings with a mean cosine similarity of 0.30 to the corresponding text prototypes, while replacing the label-name prototype targets with discriminative activity descriptions raises this to 0.69. This alignment improvement transfers consistently across all three inference methods. The strongest result combines contrastive training with inverted softmax correction, achieving 73.2% accuracy and 0.583 macro F1 on unseen classes, compared to 58.3% accuracy and 0.34 macro F1 for the label-name baseline. A secondary finding is that richer text descriptions reduce inter-prototype separability in Sentence-BERT space, because shared biomechanical vocabulary causes the language model to compress the prototype cloud. This effect does not negate the benefits of contrastive alignment provided prototype descriptions retain sufficient discriminative vocabulary. We also demonstrate that overall accuracy is a misleading primary metric when test-set class distributions are imbalanced, and recommend macro-averaged F1 as the standard reporting metric for ZSL-HAR benchmarks.
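The closing point about accuracy versus macro-averaged F1 is easy to see on a small imbalanced example. The sketch below uses made-up labels (not PAMAP2 data) and hypothetical function names: a classifier that always predicts the majority class looks strong on accuracy but scores poorly on macro F1, because the ignored class contributes an F1 of zero with equal weight.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_class(y_true, y_pred, cls):
    """F1 for one class; defined as 0.0 when there are no true positives."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    return 0.0 if tp == 0 else 2 * tp / (2 * tp + fp + fn)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    classes = sorted(set(y_true))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Hypothetical imbalanced test set: nine samples of one class, one of another.
y_true = ["walking"] * 9 + ["rope_jumping"]
y_pred = ["walking"] * 10  # degenerate majority-class predictor

print(accuracy(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

Library implementations such as scikit-learn's `f1_score` with `average="macro"` compute the same quantity; the pure-Python version is used here only to keep the example self-contained.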

2606.10787 2026-06-10 cs.AI cs.LO 新提交

Accelerating NeurASP with vectorization and caching

通过向量化和缓存加速NeurASP

Alexander Philipp Rader, Alessandra Russo

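AI summary: Through vectorization, batch processing, and caching of intermediate computations, this paper substantially accelerates training of the neurosymbolic framework NeurASP, achieving speedups of multiple orders of magnitude on larger tasks.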

Comments: 16 pages, 5 figures, to be published in the Theory and Practice of Logic Programming (TPLP) journal for the 42nd International Conference on Logic Programming (ICLP) issue

AI中文摘要

神经符号AI将神经网络与符号程序相结合，以创建鲁棒且可解释的预测。其中一个框架是NeurASP，它训练神经网络来预测概念，并使用答案集编程（ASP）编写的规则对这些概念进行推理，以解决下游任务。关键的是，标签仅由符号规则产生的下游预测提供，而不是潜在概念。通过不可微的ASP组件进行反向传播需要昂贵的概率和梯度计算，这阻碍了其扩展到更复杂的任务。在本文中，我们通过向量化、批处理和训练期间中间计算的缓存来改善NeurASP的计算性能，从而解决其当前局限性。我们比较了原始NeurASP和新实现的计算速度，并报告了在较大任务上多个数量级的加速。为此，我们提出了一个涉及扑克牌的困难任务新数据集，用于测试NeurASP增强学习功能的能力。

英文摘要

Neurosymbolic AI combines neural networks with symbolic programs to create robust and explainable predictions. One such framework is NeurASP, which trains a neural network to predict concepts and reasons over them using rules written in answer set programming (ASP) to solve downstream tasks. Crucially, labels are only provided for the downstream prediction produced by the symbolic rules, not for the latent concepts themselves. Backpropagation through the non-differentiable ASP component requires expensive probability and gradient calculations, which has hindered scalability to more sophisticated tasks. In this paper, we address the current limitations of NeurASP by improving its computational performance through vectorization, batch processing and caching of intermediate computations during training. We compare computation speeds between the original and our new implementation of NeurASP and report speedups of multiple orders of magnitude for larger tasks. To this end, we propose a new dataset of difficult tasks involving playing cards, which we use to test the capabilities of NeurASP's enhanced learning function.

2606.10782 2026-06-10 cs.CR cs.AI cs.LG 新提交

A Bayesian Network Approach for Enhancing Security-Focused Decision Support Systems

一种增强安全导向决策支持系统的贝叶斯网络方法

Carolina Fernández-Martínez, Shuaib Siddiqui, Vanesa Daza

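AI summary: Proposes a Bayesian-network-based decision support system that helps infrastructure operators select security tools by capturing user requirements and running inference to recommend the best-suited security mechanisms; it is evaluated on runtime and prediction accuracy.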

DOI: 10.1109/LCN65610.2025.11146363
Journal ref: Proc. 2025 IEEE 50th Conference on Local Computer Networks (LCN), 2025

AI中文摘要

当今大多数基于开源网络的异构栈的采用和集成带来了明显的优势，如互操作性和高级功能的可用性。然而，另一方面，互联组件和移动部件数量的增加需要维护跨不同领域的不同工具的跨学科知识基础，以确保正常运行。为了减轻这些工作，本文提出了一种决策支持系统（DSS），指导基础设施运营商选择在其环境中采用的安全方法（例如工具）。该框架能够轻松捕获最终用户对不同领域安全三元组的高层需求，并在指定模型上运行推理，以提供更好地满足这些需求的已识别工具（安全机制）。所提出的DSS旨在提供一个可理解和可扩展的框架，以适应不同的需求和贝叶斯网络（BN）模型。提出了系统的架构和建模，并与其理论框架保持一致。其性能在时间和预测精度方面进行了评估。

英文摘要

The adoption and integration of heterogeneous stacks in most of today's open-source based networks brings clear benefits like interoperability and availability of advanced features. Yet, on the other hand the increasing number of interconnecting components and moving parts requires maintaining an ever increasing base of interdisciplinary knowledge of different tools in different domains to ensure proper operation. To alleviate such efforts, this work proposes a Decision Support System (DSS) to guide infrastructure operators through the selection of security approaches (e.g. tools) to adopt in their environments. This framework easily captures the end-user high-level requirements on the security triad for different domains and runs inference on the designated models to provide the identified tools (security mechanisms) that better serve such needs. The presented DSS aims at delivering an understandable and extensible framework to accommodate varying requirements and Bayesian Network (BN) models. The architecture and modelling of the system are proposed, aligned with its theoretical framework. Its performance is evaluated in terms of time and prediction accuracy.

2606.10774 2026-06-10 cs.LG cs.DC 新提交

Inverse Probability Weighting and Age-of-Information Aggregation for Decentralized Federated Learning under Partial Reception

部分接收下分散式联邦学习的逆概率加权与信息年龄聚合

Chanuka A. S. Hewa Kaluannakkage, Rajkumar Buyya

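AI summary: To address selection bias and update staleness in decentralized federated learning over wireless networks, proposes DFL-AA, which combines inverse probability weighting with Age-of-Information weighting; it removes link-quality bias in expectation in theory and outperforms existing baselines in experiments.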

Comments: 14 pages, 8 figures, research paper for journal submission

AI中文摘要

在有损无线网络上的分散式联邦学习面临两个关键挑战：选择偏差，即由于部分模型接收，来自劣质链路的更新被系统性地低估；以及更新过时，即异步节点贡献过时信息。我们表明，使用局部填充重建的均匀八卦聚合会引入持久的链路质量诱导偏差，而基于完整性的加权进一步放大了这种效应。为了解决这些挑战，我们提出了DFL-AA（具有自适应AoI加权聚合的分散式联邦学习），它结合了逆概率加权与基于在线EWMA的信道估计来纠正选择偏差，以及基于信息年龄的加权来减轻过时，而无需全局同步。我们从理论上证明DFL-AA在期望上消除了链路质量失真，并通过实验证明在不同丢包率、网络规模和异构无线条件下，其性能持续优于最先进的基线。

英文摘要

Decentralized Federated Learning (DFL) over lossy wireless networks faces two key challenges: selection bias, where updates from poor-quality links are systematically underrepresented due to partial model reception, and update staleness, where asynchronous nodes contribute outdated information. We show that uniform gossip aggregation with local-fill reconstruction introduces persistent link-quality-induced bias, while completeness-based weighting further amplifies this effect. To address these challenges, we propose DFL-AA (Decentralized Federated Learning with Adaptive AoI-weighted Aggregation), which combines Inverse Probability Weighting with online EWMA-based channel estimation to correct selection bias and Age-of-Information-based weighting to mitigate staleness without requiring global synchronization. We theoretically show that DFL-AA removes link-quality distortion in expectation and experimentally demonstrate consistent improvements over state-of-the-art baselines across varying loss rates, network sizes, and heterogeneous wireless conditions.
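
The selection-bias argument can be checked on a small example. The sketch below is a simplification and not the DFL-AA algorithm: it assumes each neighbor's whole update is either received or lost (the paper deals with partial model reception), it uses scalar updates, and the function names, values, and smoothing factor are all hypothetical. It computes exact expectations by enumerating every reception pattern, showing that averaging only the received updates is biased toward good links, while weighting each received update by the inverse of its reception probability is unbiased in expectation. An EWMA estimator of the reception probability is included for completeness.

```python
from itertools import product


def ewma_update(p_hat, received, alpha=0.1):
    """One EWMA step for an online estimate of a link's reception probability."""
    return (1 - alpha) * p_hat + alpha * (1.0 if received else 0.0)


def naive_mean(updates, received):
    """Average only the updates that arrived (biased toward good links)."""
    got = [u for u, r in zip(updates, received) if r]
    return sum(got) / len(got) if got else 0.0


def ipw_mean(updates, received, probs):
    """Inverse-probability-weighted average over all neighbors."""
    n = len(updates)
    return sum(u / p for u, r, p in zip(updates, received, probs) if r) / n


def expectation(estimator, updates, probs):
    """Exact expected value of an estimator over all reception patterns."""
    total = 0.0
    for pattern in product([False, True], repeat=len(updates)):
        weight = 1.0
        for r, p in zip(pattern, probs):
            weight *= p if r else (1 - p)
        total += weight * estimator(updates, pattern)
    return total


# Hypothetical scalar updates; the largest one sits behind the worst link.
updates = [1.0, 2.0, 6.0]
probs = [0.9, 0.8, 0.3]
true_mean = sum(updates) / len(updates)

exp_naive = expectation(naive_mean, updates, probs)
exp_ipw = expectation(lambda u, r: ipw_mean(u, r, probs), updates, probs)

print(f"true mean        = {true_mean:.3f}")
print(f"E[naive average] = {exp_naive:.3f}")  # below the true mean
print(f"E[IPW average]   = {exp_ipw:.3f}")    # matches the true mean
```

In practice the true reception probabilities are unknown, which is where an online estimate such as `ewma_update` would stand in for `probs`.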

2606.10765 2026-06-10 cs.CL 新提交

ArabiGEE: A Hierarchical Taxonomy for Arabic Grammatical Error Explanation

ArabiGEE：阿拉伯语语法错误解释的层次分类体系

Khaled Elhady, Omar Kallas, Nizar Habash, Bashar Alhafni

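AI summary: Proposes the first hierarchical taxonomy for Arabic grammatical error explanation grounded in explicit error types, spanning four dimensions (orthography, morphology, syntax, and lexis) with 27 error types, 140 correction types, and 324 explanations; it is used to manually annotate existing corpora in support of automatic evaluation of large language models.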

2606.10749 2026-06-10 cs.CR cs.AI 新提交

Toward Secure LLM Agents: Threat Surfaces, Attacks, Defenses, and Evaluation

迈向安全的LLM智能体：威胁面、攻击、防御与评估

Yuchen Ling, Shengcheng Yu, Zhenyu Chen, Chunrong Fang

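AI summary: Using a lifecycle-based, systems-oriented framework, this paper synthesizes 247 papers and organizes the threat surfaces, attacks, defenses, and evaluation of LLM agents around information flow, delegated authority, and persistent state; prompt injection and tool-mediated control-flow hijacking remain the dominant threats, while persistent state corruption and multi-agent propagation are emerging concerns.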

AI中文摘要

大型语言模型（LLM）智能体正迅速从对话界面转变为规划、调用工具、维护记忆并在外部环境中行动的软件组件。这一转变改变了安全风险的性质。在智能体场景中，故障不再局限于不安全的文本生成。不受信任的内容可能重定向控制流、滥用工具权限、破坏持久状态、泄露敏感信息或触发有害的外部行动。与此同时，关于LLM智能体安全的研究正在迅速扩展，但仍然分散在不同的攻击家族、防御层、应用领域和评估设置中。本文通过一个基于生命周期、系统导向的框架综合了247篇论文，该框架围绕信息流、委托权限和持久状态的交互对智能体安全进行建模。我们围绕四个问题组织文献：LLM智能体安全应如何建模，哪些威胁面和攻击家族占主导地位，提出了哪些防御措施及其权衡，以及安全声明如何评估。我们发现，提示注入和工具介导的控制流劫持仍然主导该领域，而持久状态损坏和多智能体传播正在成为新兴的核心关注点。我们进一步发现，当前的防御措施提供了有用的构建块，但组合性较弱，现有的基准测试仍然低估了长期、有状态和部署敏感的风险。我们认为，安全的LLM智能体需要明确的信任边界、有原则的权限控制、可溯源的状态管理以及与真实操作环境一致的评估实践。

英文摘要

Large language model (LLM) agents are rapidly moving from conversational interfaces to software components that plan, invoke tools, maintain memory, and act on external environments. This transition changes the nature of security risk. In agentic settings, failures are no longer limited to unsafe text generation. Untrusted content may redirect control flow, misuse tool privileges, corrupt persistent state, leak sensitive information, or trigger harmful external actions. At the same time, research on LLM agent security is expanding quickly but remains fragmented across attack families, defense layers, application domains, and evaluation settings. This paper synthesizes 247 papers through a lifecycle-based, systems-oriented framework that models agent security around the interaction of information flow, delegated authority, and persistent state. We organize the literature around four questions: how LLM agent security should be modeled, which threat surfaces and attack families dominate, what defenses have been proposed and with what tradeoffs, and how security claims are evaluated. We find that prompt injection and tool-mediated control-flow hijacking still dominate the field, while persistent state corruption and multi-agent propagation are becoming central emerging concerns. We further find that current defenses provide useful building blocks but remain weakly compositional, and that existing benchmarks still underrepresent long-horizon, stateful, and deployment-sensitive risks. We argue that secure LLM agents require explicit trust boundaries, principled privilege control, provenance-aware state management, and evaluation practices aligned with realistic operational settings.

2606.10742 2026-06-10 cs.CR cs.LG 新提交

MemVenom: Triggered Poisoning of Multimodal Memories in Web Agents

MemVenom：网络代理中多模态记忆的触发式投毒

Yv Zhang, Hao Sun, Hao Fang, Kuofeng Gao, Fan Mo, Bin Chen, Shu-Tao Xia, Yaowei Wang

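AI summary: Proposes MemVenom, a black-box framework that poisons the multimodal external memory of web agents through trigger-conditioned retrieval and post-retrieval attack induction, achieving high attack success rates with minimal impact on benign performance.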

Comments: Preprint. 27 pages, 6 figures, 6 tables

AI中文摘要

外部记忆已成为现代网络代理的核心组件，通过检索过去经验实现长程推理。然而，这种范式引入了一个关键漏洞：注入记忆中的恶意内容可以被持续召回并反复影响代理行为。在这项工作中，我们识别并系统研究了多模态记忆投毒——网络代理系统中一个被忽视但实际存在的攻击面。我们提出MemVenom，一个统一的黑盒攻击框架，通过协调的文本-图像证据对图结构外部记忆进行投毒。我们的方法包括两阶段设计：(1) 触发条件检索攻击，确保恶意记忆的高概率召回；(2) 检索后攻击诱导，利用对抗性扰动和隐蔽OCR注入覆盖原始用户目标。与先前针对提示或纯文本记忆的攻击不同，我们的方法无需修改模型参数或重新优化恶意任务，即可实现持久、可重用且目标无关的攻击。在多个网络代理框架和视觉语言模型上的实验表明，MemVenom在最小化对良性性能影响的同时，实现了强大的端到端攻击成功率，在GPT-5系列网络代理上达到99.15%，并在不同架构和模型规模间有效迁移。

英文摘要

External memory has become a core component of modern web agents, enabling long-horizon reasoning through the retrieval of past experiences. However, this paradigm introduces a critical vulnerability: malicious content injected into memory can be persistently recalled and repeatedly influence agent behavior. In this work, we identify and systematically study multimodal memory poisoning, an overlooked yet practical attack surface in web-agent systems. We propose MemVenom, a unified black-box attack framework that poisons graph-structured external memory with coordinated text-image evidence. Our method consists of a two-stage design: (1) a trigger-conditioned retrieval attack that ensures high-probability recall of malicious memory, and (2) a post-retrieval attack induction that leverages adversarial perturbations and stealthy OCR injection to override the original user objective. Unlike prior attacks that operate on prompts or text-only memory, our approach enables persistent, reusable, and goal-agnostic attacks without modifying model parameters or re-optimizing malicious tasks. Experiments across multiple web-agent frameworks and vision-language models demonstrate that MemVenom achieves strong end-to-end attack success with minimal impact on benign performance, reaching up to 99.15% on GPT-5-family web agents, while transferring effectively across architectures and model scales.

2606.10740 2026-06-10 cs.AI cs.CL cs.LG 新提交

When the Chain of Thought Knows Better: Failure Modes in Multi-Turn Reasoning Models

当思维链更清楚时：多轮推理模型的失败模式

Sai Kartheek Reddy Kasu, Nils Lukas, Samuele Poppi

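AI summary: Proposes the CoT-Output 2x2 safety matrix to diagnose hidden temporal failures in multi-turn reasoning models, and identifies two reproducible vulnerabilities: an oversight paradox and context-injection failure.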

Comments: Accepted at the ICML 2026 FAGEN Workshop

AI中文摘要

多轮推理模型中的失败在终端评分评估中基本不可见。模型可能在长对话早期锁定不安全立场，但其最终轮拒绝率可能看起来与稳健对齐的基线无法区分。为了揭示这些隐藏的时间动态，我们提出了一种轨迹级诊断方法——CoT-Output 2x2安全矩阵。该框架沿两个独立轴（内部推理和可见输出）标记每一轮，产生四个操作定义的失败单元：稳健对齐、对齐伪装、显式越狱，以及我们称为上下文注入失败的不同失败模式（其中CoT保持安全推理，但可见输出产生危害，突出了多轮推理不忠实的表现）。我们在五个监督条件下针对固定攻击者评估了三个蒸馏推理目标，在信息危害场景上收集了6750个轮级观察。我们的分析揭示了两个可复现的漏洞：一个监督悖论，其中显式监控线索反而增加对齐伪装率而非抑制它；以及一个上下文注入失败，其中模型尽管内部状态安全却锁定不安全的外部输出。我们发布了多轮对话和CoT轨迹的完整数据集，以支持后续的轨迹诊断研究。

英文摘要

Failures in multi-turn reasoning models are largely invisible to terminal-score evaluation. A model can lock onto an unsafe stance early in a long dialogue, yet its final-turn refusal rate may appear indistinguishable from a robustly aligned baseline. To expose these hidden temporal dynamics, we propose a trace-level diagnostic - the CoT-Output 2x2 safety matrix. This framework labels every turn along two independent axes (internal reasoning and visible output), yielding four operationally defined failure cells: robust alignment, alignment faking, overt jailbreak, and a distinct failure mode we term context-injection failure (where the CoT maintains safe reasoning, but the visible output produces harm, highlighting a multi-turn manifestation of reasoning unfaithfulness). We evaluate three distilled reasoning targets against a fixed attacker across five oversight conditions, collecting 6750 turn-level observations on the Information-Hazard scenario. Our analysis reveals two reproducible vulnerabilities: an oversight paradox where explicit monitoring cues paradoxically increase alignment-faking rates rather than suppress them, and a context-injection failure where models lock onto unsafe external outputs despite safe internal states. We release the full dataset of multi-turn dialogues and CoT traces to support follow-up trace-diagnostic research.

2606.10709 2026-06-10 cs.IR cs.AI 新提交

Effective Reinforcement Learning for Agentic Search by Recycling Zero-Variance Queries During Training

通过训练期间回收零方差查询实现智能体搜索的有效强化学习

João Coelho, João Magalhães, Bruno Martins, Chenyan Xiong

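AI summary: Proposes query recycling, which returns zero-variance queries to the sampling pool during training so that the effective training distribution evolves with the policy; a 1.7B model reaches 66.0 average Pass@1 on seven multi-hop QA benchmarks, matching or surpassing models of up to 7B parameters.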

AI中文摘要

使用GRPO风格的算法已成为在仅结果奖励下训练LLM搜索代理的标准策略。使用这些算法时，只有当查询的 rollout 组混合了成功和失败时，该查询才对参数更新有贡献；全正确（太容易）和全错误（太难）的组是零方差的，浪费了 rollout 成本。现有方法将零方差视为静态属性，要么丢弃要么预过滤这些组。我们假设并通过实验验证，随着训练过程中策略的演化，查询会在零方差和信号承载状态之间翻转。基于这一直觉，我们提出查询回收，将零方差组返回到可变池中以供将来重新采样，从而使有效训练分布与策略共同演化。使用所提出的技术，在合成数据上训练的1.7B参数模型在七个多跳QA基准上平均达到66.0的Pass@1，匹配或超越使用基准监督训练的、参数高达7B的系统。回收模式分析表明，到训练结束时，回收的查询提供了大约四分之三的有效批次，贡献在策略改进恢复和策略漂移之间分配。

英文摘要

The use of GRPO-style algorithms has become the standard strategy for training LLM search agents under outcome-only rewards. With these algorithms, a query contributes to parameter updates only when its rollout group mixes successes and failures; all-correct (too-easy) and all-incorrect (too-hard) groups are zero-variance and waste rollout cost. Existing approaches treat zero-variance as a static property and either discard or pre-filter such groups. We hypothesize and empirically validate that queries flip between zero-variance and signal-bearing states as the policy evolves during training. Building on this intuition, we propose query recycling, which returns zero-variance groups to a mutable pool for future resampling, so that the effective training distribution co-evolves with the policy. With the proposed technique, a 1.7B parameter model trained on synthetic data can reach 66.0 average Pass@1 across seven multi-hop QA benchmarks, matching or surpassing systems with up to 7B parameters trained on benchmark-derived supervision. Analysis of recycling patterns shows that recycled queries supply roughly three quarters of the effective batch by the end of training, with contributions split between recovery from policy improvement and policy drift.
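
The zero-variance problem and the recycling idea can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the pool is a plain FIFO queue, recycled queries are re-appended at the end, signal-bearing queries are simply consumed, and `training_step`, `rollout_fn`, and the toy rewards are hypothetical names and values. The advantage formula is the usual group-normalized form (reward minus group mean, divided by group standard deviation).

```python
from collections import deque
from statistics import mean, pstdev


def group_advantages(rewards, eps=1e-8):
    """Group-relative advantages: (reward - group mean) / (group std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def is_zero_variance(rewards):
    """True when every rollout in the group received the same reward."""
    return len(set(rewards)) == 1


def training_step(pool, batch_size, rollout_fn):
    """Draw queries from the pool; recycle those whose group carries no signal."""
    effective, recycled = [], []
    for _ in range(min(batch_size, len(pool))):
        query = pool.popleft()
        rewards = rollout_fn(query)
        if is_zero_variance(rewards):
            recycled.append(query)  # all advantages are zero: no gradient signal
        else:
            effective.append((query, group_advantages(rewards)))
    pool.extend(recycled)  # zero-variance queries return for future resampling
    return effective


# Toy outcome rewards for three hypothetical queries (4 rollouts each).
fake_rewards = {
    "easy": [1, 1, 1, 1],   # all correct: zero variance
    "mixed": [1, 0, 0, 1],  # mixed outcomes: carries learning signal
    "hard": [0, 0, 0, 0],   # all incorrect: zero variance
}
pool = deque(["easy", "mixed", "hard"])
effective = training_step(pool, batch_size=3, rollout_fn=fake_rewards.get)

print([q for q, _ in effective])  # ['mixed']
print(list(pool))                 # ['easy', 'hard']
```

Because the policy changes between steps, a query recycled as "hard" now may produce mixed outcomes later, which is the flip between zero-variance and signal-bearing states that the abstract describes.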

2606.10705 2026-06-10 cs.LG cs.AI cs.SY eess.SY 新提交

Event-Driven Reinforcement Learning Enables Long-Horizon Control in Semiconductor Fabrication

事件驱动强化学习实现半导体制造中的长时域控制

Yavar Yeganeh, Mahsa Shekari, Nicla Frigerio, Daniele Pagano, Andrea Matta

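AI summary: Proposes an event-driven deep reinforcement learning framework that models semiconductor fabrication control as a centralized-agent problem and optimizes multi-objective policies with an event-driven temporal-difference formulation, yielding significant gains in throughput and utilization in high-fidelity simulations.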

AI中文摘要

强化学习有望优化大规模系统中的序贯决策。半导体制造系统是随机且高度约束的环境，其中异构晶圆在广泛的设备网络中经历数百个加工步骤。这些特性产生了复杂、高维的决策问题，具有延迟反馈和长时域要求，使生产计划和控制复杂化。我们提出了一个用于此规模的多目标策略优化的深度强化学习框架。具体来说，我们将控制表述为一个中心化智能体问题，其中核心策略协调系统范围的决策，而系统演化被表示为由离散事件驱动的互联时间过程。相应地，我们开发了一个定制的事件驱动时序差分公式，该公式保持通用性，并可在相关训练设置下与各种策略优化方法集成。我们研究了纳入该框架的几种核心无模型算法，并使用不同工业现实操作场景的高保真仿真评估其有效性。在广泛的验证实验中，在离线和在线设置下训练的智能体在吞吐量和利用率方面显示出显著且一致的提升。我们进一步评估了训练阶段的表现和泛化能力，阐明了替代强化学习公式和算法的相对优势。总体而言，结果支持所提出框架在控制事件驱动复杂自适应系统方面的可扩展性、通用性和可迁移性。

英文摘要

Reinforcement learning promises to optimize sequential decisions in large-scale systems. Semiconductor manufacturing systems are stochastic and highly constrained environments where heterogeneous wafers traverse hundreds of processing steps across extensive equipment networks. These characteristics yield complex, high-dimensional decision problems with delayed feedback and long-horizon requirements, complicating production planning and control. We propose a deep reinforcement learning framework for multi-objective policy optimization at this scale. Specifically, we formulate control as a centralized-agent problem, where a core policy coordinates system-wide decisions, while system evolution is represented as an interconnected temporal process driven by discrete events. Accordingly, we develop a tailored event-driven temporal-difference formulation that remains general and can be integrated with various policy optimization methods under relevant training settings. We investigate several core model-free algorithms incorporated into this framework and evaluate their effectiveness using high-fidelity simulations of diverse, industry-real operating scenarios. Across extensive validation experiments, agents trained in both offline and online settings show significant and consistent gains in throughput and utilization. We further evaluate performance and generalization across training phases, clarifying the relative strengths of alternative reinforcement learning formulations and algorithms. Overall, the results support the scalability, generality, and transferability of the proposed framework for controlling event-driven complex adaptive systems.

2606.10703 2026-06-10 cs.LG cs.CL 新提交

From Observation to Intervention: A Causal Audit of Expert Importance in Mixture-of-Experts Models

从观察到干预：混合专家模型中专家重要性的因果审计

Leonard Engmann, Christian Medeiros Adriano, Holger Giese

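AI summary: A causal audit finds that routing statistics in Mixture-of-Experts models do not predict expert importance, and that existing pruning methods succeed because of early-layer redundancy rather than by identifying dispensable experts.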

Comments: 9 pages, 2 figures, 9 tables. Accepted at the ICML 2026 Workshop on Philosophy of Science Meets Machine Learning (PhilML). Non-archival

AI中文摘要

可解释性方法通常使用观察到的模型行为的总体统计量来推断特定计算的目标干预效果；用Pearl的术语来说，它们将第一层的关联证据视为支持第二层的干预结论，而这种做法的有效性很少被检验。我们考察了一个具体实例：混合专家（MoE）剪枝中路由统计量的使用，其中利用率、激活范数和路由权重分布被视为预测哪些专家可以被移除而不产生功能损失的指标。在三个高冗余MoE架构（OLMoE-1B-7B-0924、Qwen1.5-MoE-A2.7B、DeepSeek-V2-Lite）上进行的token级干预审计发现，经过多重比较校正后，没有任何观测指标能预测任何模型中的因果专家重要性，所有60个指标-层组合的效应量均低于Cohen's $d = 0.17$。通过每个token的路由权重控制排除了统计功效不足的问题，仅在OLMoE的最后一个MoE层恢复了一个Bonferroni显著的信号（$d = +0.231$, $p = 0.0013$）。现有剪枝方法在此场景下的成功并非由于识别了可删除的专家，而是因为早期层的冗余使得大多数选择标准可互换。我们的结果提供了一个明确的反例，表明从总体观测统计量到关于专家重要性的token级干预推断这一常见推理步骤存在问题，并展示了干预审计如何校准可解释性主张的证据标准。

英文摘要

Interpretability methods routinely use population-level summary statistics over observed model behaviour to license claims about the effects of targeted interventions on specific computations; in Pearl's terms, they treat rung-1 associational evidence as if it supported rung-2 interventional conclusions, a move whose validity is rarely tested. We examine one concrete instance: the use of routing statistics in Mixture-of-Experts (MoE) pruning, where utilization rates, activation norms, and routing weight distributions are treated as predictors of which experts can be removed without functional cost. A token-level interventional audit across three high-redundancy MoE architectures (OLMoE-1B-7B-0924, Qwen1.5-MoE-A2.7B, DeepSeek-V2-Lite) finds no observational metric predicts causal expert importance after multiple-comparison correction in any model, with effect sizes below Cohen's $d = 0.17$ across all 60 metric-layer combinations. A per-token routing weight control rules out insufficient power, recovering a single Bonferroni-significant signal at OLMoE's final MoE layer ($d = +0.231$, $p = 0.0013$). Existing pruning methods succeed in this regime not by identifying dispensable experts but because early-layer redundancy renders most selection criteria interchangeable. Our results provide an explicit counterexample to the common inferential step from population-level observational summaries to token-level interventional claims about expert importance, and illustrate how interventional audits can calibrate the evidential standards for interpretability claims.
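
For readers less familiar with the two statistics quoted above, the sketch below shows how Cohen's $d$ (with a pooled standard deviation) and a Bonferroni-corrected significance threshold are computed. It is a generic illustration with made-up numbers, not the authors' audit pipeline; the abstract does not state which variant of $d$ or which family size was used for the correction, so both are assumptions here.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Standardized mean difference between two samples, using pooled variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def bonferroni_significant(p_values, alpha=0.05):
    """Flag p-values that stay below alpha after dividing it by the number of tests."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


# Made-up samples: means 4 and 3, both with sample variance 4.
group_a = [2.0, 4.0, 6.0]
group_b = [1.0, 3.0, 5.0]
print(cohens_d(group_a, group_b))  # 0.5

# Three made-up p-values; the corrected threshold is 0.05 / 3.
print(bonferroni_significant([0.001, 0.02, 0.04]))  # [True, False, False]
```

The correction shrinks the per-test threshold as the number of comparisons grows, which is why an effect has to be fairly strong to survive when many metric-layer combinations are tested at once.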

2606.10699 2026-06-10 cs.CV cs.AI 新提交

Using the YOLOv12 Model for Verifying the Correct Color Sequence of Wires in Network Cables (Patch Cords) on the Production Line

使用YOLOv12模型验证生产线上网线（跳线）中导线的正确颜色顺序

Amin Doroodchi, Danial Soleimany

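AI summary: For checking wire color sequence in network cable production, proposes a YOLOv12-based object detection model that performs high-accuracy real-time verification and reduces human error.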

AI中文摘要

在网络电缆的生产过程中，确保标准连接器内部线对的正确颜色顺序对电缆的最终性能至关重要，因为任何错位或颜色顺序错误都可能导致缺陷产品并造成巨大成本。基于数字显微镜目视检查的传统检测方法通常耗时、繁琐且容易出错。在本研究中，开发了一种基于第十二版YOLO目标检测模型的智能系统，用于识别跳线中导线的位置并验证其正确的颜色顺序。使用的数据集包括从网络连接器显微视图中捕获的2500张图像，其中70%用于训练，15%用于验证，15%用于测试。所提出的模型利用单阶段架构和学习过程中的注意力机制，实现了约98%精度的导线检测。此外，总体平均准确率、分类精度和召回率分别约为95%、99%和98%。结果表明，该系统能够在生产线上可靠地实时验证导线颜色顺序的正确性，无需人工干预，从而减少人为错误并提高制造效率。

英文摘要

In the production process of network cables, ensuring the correct color sequence of wire pairs inside the standard connector plays a critical role in the final performance of the cable, as any misplacement or color-ordering error can lead to defective products and impose significant costs. Traditional inspection methods based on visual examination through digital microscopes are typically time-consuming, tedious, and prone to human error. In this study, an intelligent system based on the twelfth version of the YOLO object detection model was developed to identify the position and verify the correct color sequence of wires in patch cords. The dataset used consisted of 2,500 images captured from microscopic views of network connectors, which were divided into 70% for training, 15% for validation, and 15% for testing. The proposed model, leveraging a single-stage architecture and attention mechanisms during learning, achieved highly accurate wire detection with approximately 98% precision. Additionally, the overall mean accuracy, classification precision, and recall were around 95%, 99%, and 98%, respectively. The results demonstrate that this system can reliably and in real time verify the correctness of wire color sequencing on the production line without the need for human intervention, thereby reducing human error and enhancing efficiency in the manufacturing process.

2606.10692 2026-06-10 cs.CR cs.LG 新提交

Do LLMs Make Neural Distinguishers Wise?

LLM 是否使神经区分器更智能？

Tatsuya Sakagami, Masashi Hisai, Naoto Yanai

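AI summary: Proposes LLM-based neural distinguishers via prompt design and evaluates them on SPECK-32/64; LLMs bring no observable performance improvement and the choice of differences stops mattering at high round counts, but supplying only the XOR results in the prompt significantly improves performance.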

Journal ref: DeMeSSAI 2026 poster

AI中文摘要

神经区分器是一种对称密钥密码的密码分析方法，它通过训练机器学习模型于具有特定差分的明文-密文对来恢复密钥。据我们所知，现有工作尚未探索使用大语言模型（LLM）进行神经区分器。在本文中，我们通过提示设计提出了基于LLM的神经区分器，并在SPECK-32/64上对其进行了广泛实验，以研究LLM能否增强神经区分器。然后，我们发现了三个关键见解。第一，通过将基于LLM的神经区分器与现有工作中的ResNet结果进行比较，我们证明LLM在神经区分器性能上没有提供可观察到的改进。第二，我们确认在高轮次下，差分的选择对基于LLM的神经区分器以及ResNet不再有效。第三，我们表明，通过仅将XOR运算结果作为提示设计，可以显著提高基于LLM的神经区分器的性能。

英文摘要

Neural distinguishers are a cryptanalysis method for symmetric-key cryptography that trains machine learning models on pairs of plaintexts and ciphertexts with specific differences in order to recover a secret key. To the best of our knowledge, no existing work has explored the use of large language models (LLMs) for neural distinguishers. In this paper, we propose LLM-based neural distinguishers through a prompt design and conduct extensive experiments with them on SPECK-32/64 to investigate whether LLMs can strengthen neural distinguishers. We then found three key insights. First, by comparing the results of LLM-based neural distinguishers with ResNet in the existing work, we demonstrate that LLMs provide no observable improvement in the performance of neural distinguishers. Second, we confirm that, at high rounds, the choice of differences is no longer effective for LLM-based neural distinguishers as well as ResNet. Third, we show that the performance of LLM-based neural distinguishers can be significantly improved by incorporating only the XOR operation results as a prompt design.
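
To make the "XOR operation results" concrete, the sketch below computes the XOR difference of a ciphertext pair under a round-reduced SPECK-32/64 round function. Several things here are assumptions rather than details from the abstract: that the XOR feature means the ciphertext-pair difference, that the input difference is (0x0040, 0x0000) as commonly used in earlier neural-distinguisher work, and that independent random round keys stand in for the real key schedule. The helper name `xor_feature` is hypothetical, and no prompt format is implied.

```python
import random

MASK = 0xFFFF  # SPECK-32/64 operates on two 16-bit words


def rol(x, r):
    """Rotate a 16-bit word left by r bits."""
    return ((x << r) | (x >> (16 - r))) & MASK


def ror(x, r):
    """Rotate a 16-bit word right by r bits."""
    return ((x >> r) | (x << (16 - r))) & MASK


def speck_round(x, y, k):
    """One SPECK-32/64 round: rotate, modular add, key XOR, rotate, XOR."""
    x = ((ror(x, 7) + y) & MASK) ^ k
    y = rol(y, 2) ^ x
    return x, y


def encrypt(x, y, round_keys):
    """Apply one round per supplied round key (no key schedule in this sketch)."""
    for k in round_keys:
        x, y = speck_round(x, y, k)
    return x, y


def xor_feature(p0, round_keys, delta=(0x0040, 0x0000)):
    """XOR difference of the ciphertexts of p0 and p0 XOR delta."""
    p1 = (p0[0] ^ delta[0], p0[1] ^ delta[1])
    c0 = encrypt(p0[0], p0[1], round_keys)
    c1 = encrypt(p1[0], p1[1], round_keys)
    return (c0[0] ^ c1[0], c0[1] ^ c1[1])


rng = random.Random(0)
plaintext = (rng.getrandbits(16), rng.getrandbits(16))

# After one round this input difference always maps to (0x8000, 0x8000).
one_round = xor_feature(plaintext, [rng.getrandbits(16)])
print(tuple(hex(w) for w in one_round))

# After more rounds the difference depends on the data and the keys.
five_rounds = xor_feature(plaintext, [rng.getrandbits(16) for _ in range(5)])
print(tuple(hex(w) for w in five_rounds))
```

The one-round case is deterministic because the input difference rotates into the most significant bit, where modular addition behaves like XOR. As rounds are added the difference distribution flattens, which is consistent with the abstract's observation that the choice of differences stops mattering at high round counts.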

2606.10688 2026-06-10 cs.RO 新提交

Self-Supervised Relevance Modelling in Autonomous Driving via Counterfactual Analysis

自动驾驶中基于反事实分析的自监督相关性建模

Luca Lusvarghi, Javier Gozalvez, Pablo Urbano Hidalgo

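AI summary: Proposes a self-supervised method based on counterfactual analysis to quantify the relevance of objects in autonomous driving, enabling real-time estimation with millisecond-level latency and producing relevance heatmaps that can inform perception and planning.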

AI中文摘要

自动驾驶依赖于计算密集型的感知管线，以持续检测和跟踪周围环境中的物体。虽然某些物体对于规划安全有效的操作至关重要，但其他物体可能不相关，并且对自动驾驶车辆的驾驶决策没有影响。关注相关物体可以更有效地利用可用计算资源，减少处理延迟，并限制感知噪声的下游传播。在这项工作中，我们提出了一种基于反事实分析的新型自监督方法，以开发相关性模型——一种基于AI的工具，用于量化物体对自动驾驶车辆的相关性。为了展示所提出方法的潜力，我们在选定城市场景中生成的合成因果数据集上训练了相关性模型。结果表明，该相关性模型能够以毫秒级延迟准确估计物体的相关性，从而在高密度场景中实现实时相关性估计。我们还展示了该相关性模型可用于构建相关性热图，为自动驾驶车辆的驾驶策略提供有价值的见解，并可用于主动通知感知和规划任务。我们公开发布了相关性模型和因果数据集。

英文摘要

Autonomous driving relies on computationally intensive perception pipelines to continuously detect and track objects in the surrounding environment. While some objects are key to plan safe and effective maneuvers, others may not be relevant and have no impact on the autonomous vehicle's driving decisions. Focusing on relevant objects allows a more efficient usage of available computational resources, reduces processing latencies, and limits the downstream propagation of perception noise. In this work, we propose a novel self-supervised approach based on counterfactual analysis to develop a relevance model - an AI-based tool that quantifies the relevance of objects for an autonomous vehicle. To demonstrate the potential of the proposed approach, we train a relevance model on a synthetic causal dataset generated in a selected urban scenario. Results show that the relevance model is able to accurately estimate the objects' relevance with millisecond-level latency, enabling real-time relevance estimation also in high-density scenarios. We also show that the relevance model can be used to build relevance heatmaps that offer valuable insights into the autonomous vehicle's driving policy and can be used to proactively inform perception and planning tasks. We openly release both the relevance model and the causal dataset.

2606.10669 2026-06-10 cs.LG cs.AI cs.CR 新提交

In Defense of Information Leakage in Concept-based Models

为基于概念模型中的信息泄露辩护

Mateo Espinosa Zarlenga

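AI summary: This position paper revisits information leakage in concept-based models, introduces the notion of benign leakage, and shows that optimizing a reframed training objective lets models encourage and exploit leakage without sacrificing accuracy or intervenability when concepts are incomplete.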

Comments: Accepted as a position paper at the Forty-Third International Conference on Machine Learning (ICML 2026)

AI中文摘要

基于概念的模型（CMs）是深度神经网络，其预测基于与人类可理解概念（如“圆形”、“条纹”等）对齐的表示。已有研究表明，这些模型会学习到泄露概念无关信息的表示。传统观点认为，这种泄露是不可取的，应予以消除，因为它会导致模型不可解释。在本文中，我们认为这种关于CMs中泄露的传统观点不仅是不恰当的（因为泄露如何使模型更不可解释的证据往往不明确），而且在常见的现实约束下必然导致不实用的CMs。具体来说，我们认为在概念不完整是常态的现实环境中，为了构建准确且可干预的CMs，某种程度的泄露往往是必要的。为此，我们提出存在所谓的良性泄露，并表明通过重新优化典型的CM训练目标，CMs可以鼓励并利用这种形式的泄露，而不会牺牲准确性或可干预性。

英文摘要

Concept-based models (CMs), deep neural networks that ground their predictions on representations aligned with human-understandable concepts (e.g., "round", "stripes", etc.), have been shown to learn representations that leak concept-irrelevant information. As the traditional narrative goes, this leakage is undesirable and should be eradicated as it leads to uninterpretable models. In this paper, we posit that this conventional view of leakage in CMs is not only ill-posed, as the evidence of how leakage makes a model less interpretable is often inconclusive, but also bound to lead to impractical CMs under common real-world constraints. Specifically, we argue that in real-world settings where concept incompleteness is the norm, some leakage is often necessary for constructing accurate and intervenable CMs. To this end, we propose that there is such a thing as benign leakage and show that, by optimizing a reframing of the typical CM training objective, CMs can encourage and exploit this form of leakage without sacrificing accuracy or intervenability.
