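arXivDaily: a daily academic digest synced with the full arXiv feed, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and related fields.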

2605.08777 2026-05-12 stat.ML cs.LG math.PR

Measuring and Decomposing Mode Separation via the Canonical Diffusion

Shaul Tolkovsky, Ori Meidler, Or Zuk

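AI summary: This paper studies how to measure mode separation in a density, that is, how a distribution organizes into clusters separated by barriers, a property that is hard to quantify in high dimensions. The authors propose a reversible diffusion whose stationary distribution is the density of interest and extract two quantities from its autocovariance matrix: SSA (sum of squared autocorrelations), a barrier-sensitive measure of separation, and DA (dominant autocorrelation direction), which captures metastable structure. The method needs only samples and the score function, scales to high-dimensional data, and is validated on synthetic Gaussian mixtures, text-to-image generation, and molecular dynamics.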

2605.08766 2026-05-12 cs.IR cs.CL

UserGPT Technical Report

Yunyi Xuan, Hao Yi, Fengling Mao, Daye Cai, Leikun Liang, Xingsheng He, Jiangnan Xie, Guoshuai Wang, Yushan Han, Wenwen Guo, Xiaoxiao Xu, Lin Qu

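AI summary: This paper studies how large language models (LLMs) can turn large-scale digital traces into coherent user narratives for more accurate, personalized user understanding. To address the scarcity of real behavioral data, the authors propose UserGPT, a framework that combines a user behavior simulation engine with a semantic data-processing module, together with a curriculum-learning fine-tuning strategy that improves reasoning over long behavior histories. Experiments show that UserGPT performs well on user-tag generation and behavior summarization, substantially compressing behavior records while preserving key information.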

2605.08761 2026-05-12 cs.MA cs.LG

Beyond the All-in-One Agent: Benchmarking Role-Specialized Multi-Agent Collaboration in Enterprise Workflows

Tao Yu, Hao Wang, Changyu Li, Shenghua Chai, Minghui Zhang, Zhongtian Luo, Yuxuan Zhou, Haopeng Jin, Zhaolu Kang, Jiabing Yang, YiFan Zhang, Xinming Wang, Hongzhu Yi, Zheqi He, Jing-Shu Zheng, Xi Yang, Yan Huang, Liang Wang

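AI summary: This work introduces EntCollabBench, a benchmark for evaluating multi-agent collaboration in enterprise workflows. It simulates an organization with permission isolation and role specialization, comprising 11 specialized roles across six departments, and defines two evaluation subsets, workflow and approval, that emphasize system-state modification and policy-based decisions. Experiments show that current LLM agents still fall clearly short in end-to-end collaboration, task handoff, and decision commitment; the benchmark offers a reproducible testbed for evaluating and improving enterprise agent systems.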

Comments 45 pages

2605.08744 2026-05-12 cs.GR cs.AI cs.LG

MeshFIM: Local Low-Poly Mesh Editing via Fill-in-the-Middle Autoregressive Generation

Dingdong Yang, Jian Liu, Biwen Lei, Haohan Weng, Zhuo Chen, Song Guo, Hao Richard Zhang, Ali Mahdavi Amiri, Chunchao Guo

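AI summary: This paper proposes MeshFIM, a local editing method for low-poly meshes that uses conditional generation to regenerate only the unsatisfactory region instead of the whole mesh, saving computation and preserving the structure of the remaining regions. To handle boundary alignment, topology preservation, and region overflow, MeshFIM introduces several techniques, including boundary vertex marking, contextual position embeddings, and an extended context width, which improve the quality and efficiency of local edits. Experiments show that MeshFIM outperforms several baselines on mesh repair and full-mesh generation, and supports applications such as interactive editing and automatic defect repair.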

2605.08687 2026-05-12 cs.DB cs.AI

PrepBench: How Far Are We from Natural-Language-Driven Data Preparation?

Jingzhe Xu, Rui Wang, Jiannan Wang, Guoliang Li

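AI summary: Data preparation is a core and time-consuming stage of the data analysis pipeline. Traditional tools rely on graphical user interfaces, whereas recent progress in large language models makes natural-language-driven data preparation possible. To assess how far current LLM-based systems have come, the authors present PrepBench, a benchmark covering three core capabilities: interactive disambiguation, preparation-code generation, and code-to-workflow conversion. Its tasks span multiple domains and involve complex multi-step procedures, and experiments show that even state-of-the-art models still struggle with natural-language-driven data preparation.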

2605.08681 2026-05-12 stat.ML cs.AI cs.LG cs.NA math.NA

Core-Halo Decomposition: Decentralizing Large-Scale Fixed-Point Problems

Haixiang, Yang Xu, Jiefu Zhang, Xudong Wu, Zihan Zhou, Jun He, Jiayu Chen

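AI summary: This paper studies decomposition methods for solving large-scale fixed-point equations $x^\star = \bar{F}(x^\star)$. Conventional strict decompositions assign variables to separate agents, which truncates dependencies and introduces structural bias. The authors propose the Core-Halo decomposition, which separates write access from read access: each agent updates its own core variables while reading overlapping halo variables, so the original fixed-point problem is implemented faithfully. Experiments show that the method retains the benefits of decentralization while performing close to a centralized solver.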
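The write/read separation described in this summary can be illustrated on a toy linear fixed-point problem. This is a minimal sketch, not the paper's algorithm: the map $F(x) = Ax + b$, the matrix `A`, the two-agent partition `cores`, and the function names are all assumptions chosen for illustration. With `use_halo=False`, each agent ignores variables outside its core (a strict decomposition); with `use_halo=True`, it reads them but never writes them.

```python
def full_map(x, A, b):
    """Centralized map F(x) = A x + b (a contraction for the toy A below)."""
    n = len(x)
    return [sum(A[i][j] * x[j] for j in range(n)) + b[i] for i in range(n)]


def solve_centralized(A, b, iters=200):
    x = [0.0] * len(b)
    for _ in range(iters):
        x = full_map(x, A, b)
    return x


def solve_partitioned(A, b, cores, use_halo, iters=200):
    """Each agent writes only its own core indices.

    use_halo=True : out-of-core dependencies are read from the shared state.
    use_halo=False: out-of-core dependencies are dropped (strict decomposition).
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(iters):
        new_x = list(x)
        for core in cores:
            core_set = set(core)
            for i in core:
                total = b[i]
                for j in range(n):
                    if j in core_set or use_halo:
                        total += A[i][j] * x[j]
                new_x[i] = total
        x = new_x
    return x


# Toy chain-structured problem (hypothetical data): row sums are below 1,
# so plain iteration converges.
A = [
    [0.0, 0.3, 0.0, 0.0],
    [0.3, 0.0, 0.3, 0.0],
    [0.0, 0.3, 0.0, 0.3],
    [0.0, 0.0, 0.3, 0.0],
]
b = [1.0, 1.0, 1.0, 1.0]
cores = [[0, 1], [2, 3]]  # agent 0 owns x0, x1; agent 1 owns x2, x3

x_central = solve_centralized(A, b)
x_halo = solve_partitioned(A, b, cores, use_halo=True)
x_strict = solve_partitioned(A, b, cores, use_halo=False)

halo_error = max(abs(p - q) for p, q in zip(x_halo, x_central))
strict_error = max(abs(p - q) for p, q in zip(x_strict, x_central))
```

In this toy setting the halo variant reproduces the centralized solution, while the strict variant converges to a different point because the coupling between `x1` and `x2` is cut.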

2605.08680 2026-05-12 cs.SE cs.AI cs.LG

Semantic Voting: Execution-Grounded Consensus for LLM Code Generation

Shan Jiang, Zijian Yi, Chenguang Zhu

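AI summary: This paper studies how to select the best candidate in LLM-based code generation when no complete oracle is available. The authors propose Semantic Voting, which executes candidate programs on LLM-generated inputs and clusters them by execution results to improve selection accuracy. Experiments show that execution-based selection significantly outperforms conventional voting on output patterns across multiple configurations, that input quality is the most important factor, and that sketch-based input generation works better than direct generation or random fuzzing.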
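The idea of clustering candidates by execution behavior can be sketched as follows. This is an illustration under stated assumptions rather than the paper's implementation: the candidate functions, the probe inputs, and the tie-breaking rule (first-seen cluster wins) are all hypothetical, and the inputs are hard-coded here instead of being generated by an LLM.

```python
from collections import defaultdict


def behavior_signature(func, inputs):
    """Run one candidate on every probe input and record what it does."""
    signature = []
    for value in inputs:
        try:
            signature.append(("ok", repr(func(value))))
        except Exception as exc:  # a crash is also observable behavior
            signature.append(("err", type(exc).__name__))
    return tuple(signature)


def semantic_vote(candidates, inputs):
    """Group candidates with identical behavior; return the largest group."""
    clusters = defaultdict(list)
    for name, func in candidates.items():
        clusters[behavior_signature(func, inputs)].append(name)
    winning_cluster = max(clusters.values(), key=len)
    return winning_cluster[0], winning_cluster


# Hypothetical candidates for "return the absolute value of an integer".
candidates = {
    "cand_a": lambda x: x if x >= 0 else -x,
    "cand_b": lambda x: abs(x),
    "cand_c": lambda x: x,          # buggy: wrong for negatives
    "cand_d": lambda x: max(x, -x),
    "cand_e": lambda x: -x,         # buggy: wrong for positives
}
probe_inputs = [-3, 0, 5]

chosen, cluster = semantic_vote(candidates, probe_inputs)
```

Note that a probe set containing only `0` would put all five candidates in one cluster, which is one way to see why the summary singles out input quality as the dominant factor.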

2605.08679 2026-05-12 cs.SI cs.AI cs.LG

Attention-based graph neural networks: a survey

Chengcheng Sun, Chenhao Li, Xiang Lin, Tianji Zheng, Fanrong Meng, Xiaobin Rui, Zhixiao Wang

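AI summary: This survey reviews recent progress in attention-based graph neural networks (GNNs). It traces their development and typical architectures and proposes a two-level taxonomy made up of three developmental stages and multiple architecture types. The article reviews each family of methods in detail, summarizes their advantages and disadvantages, provides a model characteristics comparison table, and discusses open challenges and future research directions, giving researchers a comprehensive reference.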

Comments This is the accepted manuscript of an article published in Artificial Intelligence Review. The final version is available online at: [10.1007/s10462-023-10577-2](https://link.springer.com/article/10.1007/s10462-023-10577-2)

DOI: 10.1007/s10462-023-10577-2
Journal ref: Artif Intell Rev 56 (Suppl 2), 2263-2310 (2023)

Abstract

Graph neural networks (GNNs) aim to learn well-trained representations in a lower-dimension space for downstream tasks while preserving the topological structures. In recent years, attention mechanism, which is brilliant in the fields of natural language processing and computer vision, is introduced to GNNs to adaptively select the discriminative features and automatically filter the noisy information. To the best of our knowledge, due to the fast-paced advances in this domain, a systematic overview of attention-based GNNs is still missing. To fill this gap, this paper aims to provide a comprehensive survey on recent advances in attention-based GNNs. Firstly, we propose a novel two-level taxonomy for attention-based GNNs from the perspective of development history and architectural perspectives. Specifically, the upper level reveals the three developmental stages of attention-based GNNs, including graph recurrent attention networks, graph attention networks, and graph transformers. The lower level focuses on various typical architectures of each stage. Secondly, we review these attention-based methods following the proposed taxonomy in detail and summarize the advantages and disadvantages of various models. A model characteristics table is also provided for a more comprehensive comparison. Thirdly, we share our thoughts on some open issues and future directions of attention-based GNNs. We hope this survey will provide researchers with an up-to-date reference regarding applications of attention-based GNNs. In addition, to cope with the rapid development in this field, we intend to share the relevant latest papers as an open resource at https://github.com/sunxiaobei/awesome-attention-based-gnns.

2605.08645 2026-05-12 physics.plasm-ph cs.LG

Energy-based models for diagnostic reconstruction and analysis in a laboratory plasma device

Phil Travis, Troy Carter

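AI summary: This paper applies energy-based models (EBMs) to laboratory plasma physics for reconstructing and analyzing diagnostic data. By constructing an energy surface, the model learns the joint probability distribution of the data, enabling analysis of and conditional sampling from complex nonlinear plasma phenomena. The authors train an EBM that combines convolutional neural networks with attention on data from the Large Plasma Device (LAPD) and demonstrate its use for diagnostic reconstruction, inverse problems, and anomaly detection, offering a new analysis tool for plasma physics.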

Comments 15 pages, 10 figures

2605.08633 2026-05-12 cs.DC cs.CV

Transforming the Use of Earth Observation Data: Exascale Training of a Generative Compression Model with Historical Priors for up to 10,000x Data Reduction

Jinxiao Zhang, Runmin Dong, Xiyong Wu, Xihan Huang, Shenggan Cheng, Yunkai Yang, Zheng Zhou, Yunpu Xu, Zhaoyang Luo, Miao Yang, Fan Wei, Mengxuan Chen, Yang You, Juepeng Zheng, Weijia Li, Yutong Lu, Haohuan Fu

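AI summary: This work proposes a generative compression framework based on historical priors that aims to turn Earth observation data compression from a conventional storage and transmission tool into a new way of using data, reaching compression ratios of up to 10,000x. Through extreme-scale training on the LineShine Armv9 supercomputer, the team optimized model design, kernels, the memory hierarchy, the runtime, and parallelism, achieving training performance of 1.54 to 2.16 EFLOP/s. The method exploits the fact that Earth observation repeatedly measures the same planet, which makes extreme compression feasible, and shows the potential of historical-prior generative compression for data acquisition, transmission, storage, and scientific applications.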

2605.08626 2026-05-12 eess.SP cs.DC cs.LG cs.MA

Large Language Models over Networks: Collaborative Intelligence under Resource Constraints

Liangqi Yuan, Wenzhi Fang, Shiqiang Wang, H. Vincent Poor, Christopher G. Brinton

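AI summary: This paper studies how multiple networked devices and cloud-hosted large language models (LLMs) can collaborate under resource constraints to deliver high-quality intelligent services. It presents two complementary modes of collaborative inference, vertical device-cloud collaboration and horizontal multi-agent collaboration, and discusses how collaborative training can learn routing policies and improve cooperation among models. Its main contributions are a collaborative-intelligence framework suited to different resource constraints and a set of open research challenges around scaling over heterogeneous resources and trustworthy collaboration.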

2605.08594 2026-05-12 cs.AR cs.IT cs.LG math.IT

FLARE: One-Shot PE-Level Fault Localization in Systolic Arrays via Algebraic Test Vectors

Logashree Venkatasubramanian, Zishen Wan, Viveck Cadambe

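AI summary: This paper proposes FLARE, an algorithmic method for one-shot fault localization at the processing-element (PE) level in systolic arrays. It uses coprime test vectors so that a fault in each PE can be uniquely identified from the deviation it produces, enabling efficient fault localization without hardware redundancy. Experiments show that, under INT16 arithmetic, the method localizes faults with high probability in arrays as large as 256×256, with a test overhead below 1% of a single inference GEMM tile.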

2605.08590 2026-05-12 cs.HC cs.AI cs.CL cs.CY

Causal Stories from Sensor Traces: Auditing Epistemic Overreach in LLM-Generated Personal Sensing Explanations

Shanshan Zhu, Han Zhang, J. Doris Chi, Subigya Nepal, Koustuv Saha

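AI summary: This study examines epistemic overreach in LLM explanations of personal sensing data, where a generated explanation claims more than the available data supports. Using anomalous-day scenarios drawn from three sensing datasets of college students, the authors generate a large number of explanations with three LLM families and assess them for unsupported causal attribution, unacknowledged data gaps, overconfident language, and related dimensions. The results show that LLMs often draw causal inferences without sufficient evidence and that providing more context does not reliably reduce the problem, underscoring the need for evidential rigor in personal sensing explanations.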

Abstract

LLMs are increasingly used to explain personal sensing data, translating traces of activity and mood into natural-language accounts of why an anomalous day may have occurred. However, such explanations can sound coherent and personally meaningful even when the underlying evidence is sparse or missing. We introduce epistemic overreach (EO) as a measure for cases where a generated explanation implies more than the available sensing evidence can justify. To audit how often and in what forms EO occurs, we obtained anomalous-day scenarios from three longitudinal sensing datasets of college students: StudentLife, GLOBEM, and CollegeExperience. Across activity, sleep, and affect anomalies, we generated 14,922 explanations using three LLM families -- Llama, Qwen, and GPT -- under two prompting conditions: one minimally constrained prompt and another prompt explicitly instructing models to bound claims to the data. For each scenario, we varied the amount of behavioral evidence available to the model to examine whether more evidence reduces EO. We evaluated each explanation using a structured rubric, decomposing EO into the dimensions of unsupported causal attribution, unacknowledged data gaps, overconfident language, temporal inconsistency, and diagnostic inference. We find that LLMs routinely attribute anomalous days to causes without sufficient support from the data, and that this pattern replicates across datasets, anomaly types, and model families. Further, providing richer context does not reliably reduce EO; bounded prompting helps but does not eliminate it. These findings suggest that evidential grounding should be a first-order evaluation criterion for LLM-generated personal sensing explanations, alongside fluency and plausibility. We argue that personal sensing explanations require evidential discipline: systems must distinguish what is observed, what is inferred, and what remains unknown.

2605.08580 2026-05-12 cs.MA cs.AI

Slipstream: Trajectory-Grounded Compaction Validation for Long-Horizon Agents

Zhuofu Chen, Rui Pan, Yinwei Dai, Ravi Netravali

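AI summary: To cope with the large contexts produced by long-horizon LLM agents, this work proposes asynchronous compaction: the compactor runs in parallel with agent execution on the original context, which yields a validation signal independent of the compacted summary and closes a structural validation gap. The resulting system, Slipstream, uses a judge to check whether a candidate summary preserves the agent's forward intent and its key facts and constraints, improving task accuracy and reducing end-to-end latency.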

Comments 9 pages (16 pages counting references, appendix), 6 figures, 2 tables

2605.08561 2026-05-12 stat.ML cs.LG

CONTRA: Conformal Prediction Region via Normalizing Flow Transformation

Zhenhan Fang, Aixin Tan, Jian Huang

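AI summary: This paper proposes CONTRA, a method for producing reliable prediction regions for multi-dimensional outputs. It defines the nonconformity score in the latent space of a normalizing flow, which addresses the imprecise prediction regions that conventional methods produce in high dimensions. CONTRA yields sharper prediction regions, can be combined with existing predictive models to make their predictions more reliable, and applies across a variety of datasets.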
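A rough sketch of a latent-space nonconformity score in split conformal prediction is given below. It makes strong simplifying assumptions that are not from the paper: the "flow" is a fixed affine map with hypothetical scales `SCALE` rather than a trained normalizing flow, the outputs are unconditional 2-D samples, and the score is the Euclidean norm of the latent point. Only the calibration rule (the $\lceil (n+1)(1-\alpha) \rceil$-th smallest calibration score) is the standard split conformal quantile.

```python
import math
import random

SCALE = (2.0, 0.5)  # hypothetical per-axis scales of the stand-in "flow"


def to_latent(y):
    """Stand-in for a trained flow: an invertible affine map to latent space."""
    return (y[0] / SCALE[0], y[1] / SCALE[1])


def score(y):
    """Nonconformity score: distance of the latent point from the origin."""
    z = to_latent(y)
    return math.hypot(z[0], z[1])


def conformal_radius(cal_scores, alpha):
    """Split conformal quantile of the calibration scores."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(cal_scores)[k - 1]


def in_region(y, radius):
    """The region is a latent ball, i.e. an ellipse in output space here."""
    return score(y) <= radius


rng = random.Random(0)


def sample_output():
    # Synthetic 2-D outputs with very different spread per axis.
    return (rng.gauss(0.0, 2.0), rng.gauss(0.0, 0.5))


calibration = [sample_output() for _ in range(500)]
radius = conformal_radius([score(y) for y in calibration], alpha=0.1)

test_points = [sample_output() for _ in range(2000)]
coverage = sum(in_region(y, radius) for y in test_points) / len(test_points)
```

Because the region is a ball in latent space, its shape in output space follows the map: here an axis-aligned ellipse with semi-axes `radius * SCALE[0]` and `radius * SCALE[1]`. The empirical `coverage` should land near the nominal 90%, up to sampling noise.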

Comments 18 pages, 7 figures and 5 tables

2605.08559 2026-05-12 math.FA cs.LG cs.NA cs.NE math.NA math.OC

Structure-Preserving Reconstruction of Convex Lipschitz Functionals on Hilbert Spaces from Finite Samples

Anastasis Kratsios

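AI summary: This paper studies how to reconstruct convex Lipschitz functionals defined on a separable Hilbert space from finitely many sample points. The author gives an explicit finite-sample reconstruction that preserves convexity and the Lipschitz property while achieving any prescribed accuracy. The construction needs only finitely many linear measurements and can be implemented by a ReLU neural network, which leads to convex neural functionals (CNFs), a structured trainable model that provides a theoretical basis for learning convex functionals from finite data.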

2605.08553 2026-05-12 cs.SE cs.AI cs.LG

VeriContest: A Competitive-Programming Benchmark for Verifiable Code Generation

Zichen Xie, Mrigank Pawagi, Yuxin Liu, Aaditi Rai, Lize Shao, John Berberian, Sicong Che, Wenxi Wang

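AI summary: VeriContest is a competitive-programming benchmark for verifiable code generation, comprising 946 problems from LeetCode and Codeforces and targeting Rust with the Verus verifier. Each problem comes with a natural-language description, expert-validated formal specifications, judge-accepted code, formal proofs, and test cases, supporting separate evaluation of specification generation, code generation, proof generation, and end-to-end verification. The benchmark is built through a three-phase pipeline that combines manual verification with semi-automated expansion and adds testing as a quality-assurance layer. It reveals a large gap between verifiable code generation and ordinary code generation in current models and provides a rigorous evaluation platform for future research.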

Abstract

Large language models can generate useful code from natural language, but their outputs come without correctness guarantees. Verifiable code generation offers a path beyond testing by requiring models to produce not only executable code, but also formal specifications and machine-checkable proofs. Progress in this direction, however, is difficult to measure: existing benchmarks are often small, focus on only one part of the pipeline, lack ground-truth proofs or rigorous specification validation, or target verification settings far from mainstream software development. We present VeriContest, a benchmark of 946 competitive-programming problems from LeetCode and Codeforces for verifiable code generation in Rust with Verus. Each problem pairs a natural language description with expert-validated formal specifications, judge-accepted Rust code, Verus-checked proofs, and positive and negative test suites. VeriContest is constructed through a three-phase pipeline that scales from manually verified seed problems to semi-automated expansion with human-in-the-loop review. To further strengthen benchmark quality, we use testing as an additional quality-assurance layer for validating postcondition completeness. VeriContest supports isolated and compositional evaluation of specification generation, code generation, proof generation, and end-to-end verified program synthesis. Evaluating ten state-of-the-art models reveals a sharp gap between coding ability and verifiable code generation: the strongest model reaches 92.18% on natural-language-to-code generation, but only 48.31% on specification generation, 13.95% on proof generation, and 5.29% end-to-end. These results identify proof and specification generation as the central bottlenecks for models and establish VeriContest as a rigorous platform for measuring and training future systems that generate code with machine-checkable correctness.

2605.08552 2026-05-12 stat.ML cs.LG

Learnability and Competition in High-Dimensional Multi-Component ICA

Eser Ilke Genc, Samet Demir, Zafer Dogan

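AI summary: This paper studies learnability and competition in high-dimensional multi-component independent component analysis (ICA). It develops an asymptotically exact mean-field theory that describes the coupling between estimated directions and true components during online learning. In the high-dimensional limit, the overlap matrix between the estimates and the true components satisfies a closed system of ordinary differential equations, from which the authors identify two initialization-driven regimes: a decoupled regime and a competitive regime. The theory gives explicit learnability boundaries and competition conditions in terms of learning rate, data moments, and initialization, and experiments confirm the predicted trajectories and phase transitions.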

Comments 56 pages, 9 figures

2605.08546 2026-05-12 stat.ML cs.LG math.OC

Sliced Inner Product Gromov-Wasserstein Distances

Xiaoyun Gong, Gabriel Rioux, Ziv Goldfeld

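AI summary: This paper addresses the scalability of the inner product Gromov-Wasserstein (IGW) distance for high-dimensional data. It proposes a sliced IGW distance with natural rotation invariance and resolves the difficulty that IGW lacks a closed-form solution in one dimension. The approach is validated through theoretical analysis and numerical experiments and applied to heterogeneous clustering of text data and to comparing language-model representations.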

Comments 49 pages, 8 figures

2605.08528 2026-05-12 cs.MA cs.RO

SceneFactory: GPU-Accelerated Multi-Agent Driving Simulation with Physics-Based Vehicle Dynamics

Yicheng Zhu, Yang Chen, Tao Li, Zilin Bian

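AI summary: This paper presents SceneFactory, a GPU-vectorized autonomous-driving simulation platform that delivers efficient multi-agent simulation while retaining physical fidelity. Built on NVIDIA Isaac Sim and Isaac Lab, it represents worlds and agents as batched tensors so that control, observation, reward computation, and policy inference run in parallel on the GPU. Experiments show up to 127x higher simulation throughput than a non-vectorized baseline on the same hardware, and demonstrate the benefit of physics-aware policies under challenging conditions such as wet roads.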

Abstract

Autonomous-driving simulators typically trade physical fidelity for scalable parallelism. Physics-based platforms such as CARLA and MetaDrive provide articulated vehicle dynamics and contact, but their non-vectorized interfaces make batched training difficult. GPU-batched systems such as Waymax and GPUDrive scale to hundreds of scenarios by replacing rigid-body physics with simplified kinematic models, omitting tire-road interaction, suspension, contact dynamics, and road-condition-dependent friction. We introduce SceneFactory, a GPU-vectorized platform for procedural scene construction, physics-based multi-agent simulation, and RL in autonomous-driving environments. Built on NVIDIA Isaac Sim + Isaac Lab, SceneFactory represents worlds and agents as batched tensors: control, observations, rewards, resets, and policy inference run as GPU tensor operations over the Isaac Lab tensor API. SceneFactory converts Waymo Open Motion Dataset road topologies into simulation-ready USD worlds, runs many worlds concurrently on one GPU, populates each with multiple articulated PhysX vehicles, and maps precipitation and road-surface type to PhysX material friction coefficients. With GPU vectorization, SceneFactory achieves up to 127$\times$ higher throughput than a non-vectorized PhysX baseline on the same GPU and physics solver, reaching 19,250 controlled-agent simulation steps per second at 256 worlds $\times$ 16 agents. Cross-simulator transfer reveals an asymmetric dynamics gap: physics-grounded RL policies transfer to a simplified kinematic bicycle model with 99.5% success, whereas reverse transfer drops to 47.3%. Under wet-road friction, friction-aware policies reduce mean peak DRAC from 58.7 to 27.8 m/s$^2$ without sacrificing goal reach. SceneFactory shows that scalable autonomous-driving training need not discard articulated rigid-body dynamics or physically grounded road-condition variation.

2605.08527 2026-05-12 cs.DC cs.AI

MARLaaS: Multi-Tenant Asynchronous Reinforcement Learning as a Service

Timothy Tin Long Yu, Gursimran Singh, Ge Shi, Hanieh Sadri, Yong Zhang, Zhenan Fan

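AI summary: MARLaaS is a multi-tenant asynchronous reinforcement-learning-as-a-service system that aims to cut the compute cost of fine-tuning large language models and improve efficiency. It shares one base model across tenants, attaches lightweight LoRA adapters, and uses a staged asynchronous architecture to train multiple jobs concurrently. The design reduces interference between jobs and idle time, which substantially raises hardware utilization and shortens end-to-end training time.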
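The shared-base-plus-adapter arrangement mentioned in this summary can be sketched with the standard LoRA forward pass, $h = Wx + \frac{\alpha}{r} BAx$, where only the low-rank pair $(A, B)$ differs per tenant. The sketch below is illustrative only: the tenant names, matrix values, and helper names are hypothetical, and nothing here reflects how MARLaaS schedules or trains jobs.

```python
def matvec(matrix, vector):
    """Plain-Python matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(base_weight, adapter, x):
    """Compute W x + (alpha / r) * B (A x) for one tenant's adapter.

    base_weight: d_out x d_in matrix shared by every tenant (frozen).
    adapter: dict with "A" (r x d_in), "B" (d_out x r), and "alpha".
    """
    rank = len(adapter["A"])
    scale = adapter["alpha"] / rank
    base_out = matvec(base_weight, x)
    delta = matvec(adapter["B"], matvec(adapter["A"], x))
    return [b + scale * d for b, d in zip(base_out, delta)]


# One frozen base weight, shared across tenants (hypothetical values).
shared_weight = [[1.0, 0.0], [0.0, 1.0]]

# Per-tenant rank-1 adapters; only these small matrices differ.
adapters = {
    "tenant_a": {"A": [[1.0, 0.0]], "B": [[0.0], [1.0]], "alpha": 0.5},
    "tenant_b": {"A": [[1.0, 0.0]], "B": [[0.0], [0.0]], "alpha": 0.5},
}

x = [2.0, 3.0]
out_a = lora_forward(shared_weight, adapters["tenant_a"], x)
out_b = lora_forward(shared_weight, adapters["tenant_b"], x)
```

Here `tenant_b` has a zero `B` matrix, the usual LoRA initialization, so its output equals the base model's output, while `tenant_a` adds a rank-1 correction on top of the same shared weight.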

2605.08499 2026-05-12 cs.IR cs.AI

Multi-Level Graph Attention Network Contrastive Learning for Knowledge-Aware Recommendation

Zhifei Hu, Feng Xia

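AI summary: To address sparse labels, insufficient graph-structure learning, and noisy knowledge entities in knowledge-graph-enhanced recommender systems, this paper proposes a multi-view graph contrastive learning framework. It enhances user representations through multi-view knowledge-graph distillation, builds more informative item representations from neighboring entity information, and adds a multi-level self-supervised contrastive module that contrasts at the cross-layer, intra-layer, and interaction levels to improve generalization and discriminative power. Experiments show that the framework outperforms existing state-of-the-art methods on several public datasets.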

2605.08488 2026-05-12 math.OC cs.LG

A Unified Lyapunov-IQC Framework for Uniform Stability of Smooth Quadratic First-Order Accelerated Optimizers

Don Li, Dacian Daescu

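AI summary: This paper presents a unified Lyapunov and integral quadratic constraint (IQC) framework for analyzing the uniform stability of accelerated first-order optimization algorithms on smooth quadratic objectives. By combining Lyapunov functions with IQC inequalities, it models the algorithm's dynamics as a feedback interconnection of a linear system and the gradient operator, reducing stability analysis to a linear matrix inequality feasibility problem that semidefinite programming can solve. The framework covers Nesterov's accelerated gradient method, offers a new view of the structural link between optimization dynamics and robust control theory, and provides a modular route to verifying the stability of complex optimization algorithms.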

2605.08485 2026-05-12 stat.ML cs.LG math.ST stat.ME stat.TH

Sinkhorn Treatment Effects: A Causal Optimal Transport Measure

Medha Agarwal, Alex Luedtke

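AI summary: This paper proposes the Sinkhorn treatment effect, a causal optimal transport measure of the discrepancy between counterfactual distributions. Built on entropy-regularized optimal transport, it captures differences across entire distributions rather than only the average treatment effect. By expressing the measure as a smooth transformation of counterfactual mean embeddings, the authors establish its pathwise differentiability and construct a debiased estimator, which yields an asymptotically valid test for distributional treatment effects. Experiments on simulated and image data show good practical performance.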
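To make the contrast with the average treatment effect concrete, the sketch below computes a plain plug-in Sinkhorn divergence between two small outcome samples that share the same mean but differ in spread. It is an assumption-laden illustration, not the paper's estimator: it ignores confounding and debiasing entirely, uses a squared-distance cost on 1-D outcomes with uniform weights, and the sample values, `eps`, and function names are hypothetical. The divergence is the standard debiased form $S_\varepsilon(a,b) = \mathrm{OT}_\varepsilon(a,b) - \tfrac12 \mathrm{OT}_\varepsilon(a,a) - \tfrac12 \mathrm{OT}_\varepsilon(b,b)$.

```python
import math


def entropic_ot(xs, ys, eps=0.5, iters=500):
    """Entropy-regularized OT value between uniform empirical measures.

    Runs Sinkhorn scaling iterations and returns the dual objective
    sum_i a_i f_i + sum_j b_j g_j, which equals the regularized primal
    value (transport cost plus eps * KL to the product measure) at
    convergence.
    """
    n, m = len(xs), len(ys)
    a = [1.0 / n] * n
    b = [1.0 / m] * m
    kernel = [[math.exp(-((x - y) ** 2) / eps) for y in ys] for x in xs]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    f = [eps * math.log(u[i] / a[i]) for i in range(n)]
    g = [eps * math.log(v[j] / b[j]) for j in range(m)]
    return sum(a[i] * f[i] for i in range(n)) + sum(b[j] * g[j] for j in range(m))


def sinkhorn_divergence(xs, ys, eps=0.5):
    """Debiased Sinkhorn divergence; zero when both samples coincide."""
    return (
        entropic_ot(xs, ys, eps)
        - 0.5 * entropic_ot(xs, xs, eps)
        - 0.5 * entropic_ot(ys, ys, eps)
    )


# Hypothetical outcomes: same mean (0.75), different spread.
control = [0.0, 0.5, 1.0, 1.5]
treated = [-0.75, 0.25, 1.25, 2.25]

mean_difference = sum(treated) / len(treated) - sum(control) / len(control)
distributional_effect = sinkhorn_divergence(control, treated)
self_divergence = sinkhorn_divergence(control, control)
```

In this example the difference in means is zero, yet the divergence between the two samples is strictly positive, which is the kind of distribution-level difference the summary says an average treatment effect would miss.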

Comments 55 pages, 6 figures

2605.08460 2026-05-12 cs.CR cs.AI

When Child Inherits: Modeling and Exploiting Subagent Spawn in Multi-Agent Networks

Ziwen Cai, Yihe Zhang, Xiali Hei

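AI summary: This paper studies the security risks of subagent spawning in multi-agent networks. When a parent agent is compromised, the memory a newly spawned subagent inherits can carry malicious instructions, stale state, or unintended behavioral rules, widening the reach of an attack. By analyzing problems such as unsafe memory inheritance and weak resource control in current mainstream frameworks, the authors show how strongly inheritance mechanisms affect the security of multi-agent systems and propose defenses based on explicit security invariants.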

2605.08456 2026-05-12 cs.CR cs.LG

HEART: A High-Efficiency Adaptive Real-Time Telemonitoring Framework for Secure Electrocardiogram Signal Transmission Using Chaotic Encryption

Beyazıt Bestami Yuksel

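AI summary: This paper proposes HEART, a high-efficiency adaptive framework for real-time remote electrocardiogram (ECG) monitoring. It derives learnable encryption keys from features of the patient's own ECG signal and encrypts the signal in real time for transmission, protecting data privacy while preserving diagnostic accuracy. The method uses chaotic encryption with dynamic key generation and biometric key refresh to strengthen security and resistance to attack. Experiments show low encryption latency together with high-fidelity signal reconstruction, giving good real-time performance and diagnostic reliability.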

Comments 15 pages, 4 figures, 3 tables

DOI: 10.5152/electrica.2026.25232
Journal ref: ELECTRICA 2026

Abstract

The real-time analysis and secure transmission of electrocardiogram (ECG) signals are critical for accurate diagnosis and safeguarding patient privacy in telemedicine applications. This study presents a novel real-time ECG monitoring system that employs a learnable key generator (LKG) derived from each patient's own ECG signal characteristics to dynamically produce unique encryption keys. These keys determine the parameters $r$ and $x_0$ of a logistic map used for chaotic encryption. The system securely encrypts real-time ECG data immediately after acquisition, ensuring confidential transmission and storage in the cloud. For remote clinical access, the encrypted data is downloaded and decrypted on the doctor's side using the matching key generated at the source or securely stored in the cloud. This approach eliminates the need for traditional key exchange and substantially raises the cost of exhaustive key search in practice through per-segment biometric key refresh and combined permutation and XOR diffusion, supported by min-entropy evaluation. Compared to static-key methods, the learnable biometric key design offers greater unpredictability and individualization. A comprehensive set of security assessments, including Shannon entropy (7678 bits), correlation and autocorrelation disruption, histogram statistics, NIST SP 800-22 frequency testing, plaintext/key sensitivity (avalanche effect), FFT-based spectral flatness, and robustness to noise and occlusion, confirms the method's strength. Reconstruction fidelity (MSE approximately $5\times10^{-6}$, PSNR greater than 52 dB, MAE approximately 0.002) demonstrates near-lossless decryption and preserved diagnostic features. Encryption latency remains low, preserving real-time performance.
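The role of the logistic-map parameters $r$ and $x_0$ can be shown with a toy keystream cipher. This sketch is illustrative only and not secure: it implements just an XOR stage driven by the map $x_{n+1} = r\,x_n(1 - x_n)$, omits the permutation stage and the learnable key generator described in the abstract, and the byte quantization, burn-in length, key values, and sample bytes are all assumptions.

```python
def logistic_keystream(r, x0, length, burn_in=100):
    """Generate `length` key bytes from the logistic map x -> r * x * (1 - x)."""
    x = x0
    for _ in range(burn_in):  # discard the transient near the seed
        x = r * x * (1.0 - x)
    stream = []
    for _ in range(length):
        x = r * x * (1.0 - x)
        stream.append(int(x * 256) % 256)  # quantize the state to one byte
    return stream


def xor_cipher(data, r, x0):
    """XOR the data with the keystream; applying it twice restores the data."""
    keystream = logistic_keystream(r, x0, len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))


# Hypothetical 8-bit ECG samples and key parameters.
ecg_samples = bytes([120, 122, 125, 180, 240, 150, 118, 117] * 8)
key_r, key_x0 = 3.99, 0.3141

ciphertext = xor_cipher(ecg_samples, key_r, key_x0)
recovered = xor_cipher(ciphertext, key_r, key_x0)
wrong_key = xor_cipher(ciphertext, key_r, key_x0 + 1e-10)
```

Decrypting with the same `(r, x0)` restores the samples exactly, while the slightly perturbed `x0` yields a different byte sequence because nearby trajectories of the map diverge quickly in its chaotic regime.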

2605.05682 2026-05-12 cs.HC cs.AI cs.CY

PersonaTeaming: Supporting Persona-Driven Red-Teaming for Generative AI

Wesley Hanwen Deng, Mingxi Yan, Sunnie S. Y. Kim, Akshita Jha, Lauren Wilcox, Kenneth Holstein, Motahhare Eslami, Leon A. Gatys

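AI summary: This work proposes PersonaTeaming, a persona-driven approach to red-teaming that aims to improve safety evaluation of generative AI by bringing different persona perspectives into adversarial attack strategies. The PersonaTeaming Workflow incorporates persona information into adversarial prompt generation and outperforms existing methods on attack success rate and prompt diversity. To support human-AI collaboration, the authors also build PersonaTeaming Playground, an interactive interface in which red-teamers define their own personas and work with AI to refine attack prompts. Experiments show that the approach elicits diverse attack strategies and boosts red-teamers' creativity.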

2605.02416 2026-05-12 cs.IT cs.LG math.IT

Dueling DDQN-Based Adaptive Multi-Objective Handover Optimization for LEO Satellite Networks

Po-Heng Chou, Chiapin Wang, Chung-Chi Huang, Kuan-Hao Chen

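AI summary: This paper proposes a multi-objective handover optimization framework for low Earth orbit (LEO) satellite networks based on a dueling double deep Q-network (Dueling DDQN), which dynamically balances throughput, blocking probability, and handover cost. The dueling architecture improves learning, and the agent adapts to time-varying network conditions. Simulations show significantly higher throughput and a lower blocking rate than conventional methods under typical operating conditions.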
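The two ingredients named in the title can be written down compactly. The sketch below shows the standard dueling aggregation $Q(s,a) = V(s) + A(s,a) - \operatorname{mean}_{a'} A(s,a')$ and the standard Double DQN target, in which the online network selects the next action and the target network evaluates it. The scalarized reward and its weights are hypothetical placeholders for the throughput, blocking, and handover-cost trade-off; the paper's actual reward design is not given in the summary.

```python
def dueling_q_values(state_value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + adv - mean_advantage for adv in advantages]


def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double DQN target: online net picks the action, target net scores it."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


def scalarized_reward(throughput, blocking, handover_cost,
                      weights=(1.0, 1.0, 0.5)):
    """Hypothetical weighted sum of the three objectives (weights assumed)."""
    w_tp, w_blk, w_ho = weights
    return w_tp * throughput - w_blk * blocking - w_ho * handover_cost


# Each action could be read as "hand over to candidate satellite k".
q_values = dueling_q_values(1.0, [0.5, -0.5, 0.0])

target = double_dqn_target(
    reward=1.0,
    gamma=0.9,
    q_online_next=[1.0, 3.0, 2.0],   # online net prefers action 1
    q_target_next=[5.0, 0.5, 4.0],   # target net's value for action 1 is 0.5
    done=False,
)
```

In the example the target uses the target network's value for action 1 (0.5) rather than its own maximum (5.0), which is how Double DQN reduces overestimation.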

Comments 6 pages, 5 figures, 1 table, and submitted to 2026 IEEE Globecom

2605.01708 2026-05-12 cs.DC cs.AI cs.LG

SplitZip: Ultra Fast Lossless KV Compression for Disaggregated LLM Serving

Yipin Guo, Siddharth Joshi

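AI summary: In large language model (LLM) serving systems, disaggregating the prefill and decode phases makes KV cache transfer a performance bottleneck, especially for long-input and agentic workloads. This paper proposes SplitZip, a GPU-optimized lossless KV cache compressor that exploits redundancy in floating-point exponents, combining fixed-length codes with a sparse escape stream for fast compression and decompression. Experiments show that SplitZip substantially improves KV cache transfer efficiency for both BF16 and FP8 formats and speeds up end-to-end serving.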

Abstract

Contemporary systems serving large language models (LLMs) have adopted prefill-decode disaggregation to better load-balance between the compute-bound prefill phase and the memory-bound decode phase. Under this design, prefill workers generate a KV cache that must be transferred to decode workers before token generation can begin. With these workers residing on different physical systems, this transfer becomes a significant bottleneck to serving LLMs at scale. This bottleneck gets exacerbated for long-input and agentic workloads. Existing lossless codecs are not suited to this setting as they primarily target offline weight compression, run on the CPU, or use variable-length coding whose decompression is fast but compression is too slow to keep up with KV production during prefill. We introduce SplitZip, a GPU-friendly lossless compressor for KV cache transfer that preserves KV tensors bitwise and integrates into existing serving frameworks without changes to model execution. SplitZip exploits redundancy in floating-point exponents of KV activations, encoding the most frequent exponent values with fixed-length codes and routing rare exponents through a sparse escape stream of (position, value). An offline calibrated top-16 exponent codebook eliminates online-histogramming, while the regular dense path and sparse escape correction make both encoding and decoding efficient on GPUs. On real BF16 activation tensors, SplitZip achieves $613.3$ GB/s compression throughput and $2181.8$ GB/s decompression throughput, substantially outperforming prior lossless compressors on the latency-critical codec path. End-to-end transfer experiments show up to $1.32\times$ speedup for BF16 KV cache transfer, $1.30\times$ speedup for TTFT, and $1.23\times$ increase on Request Throughput. The same approach extends to FP8 KV caches, providing up to $1.14\times$ compression over native E5M2.
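The exponent-codebook idea in this abstract can be sketched on the CPU in a few lines. This is a rough illustration under stated assumptions, not SplitZip's GPU implementation or bit layout: BF16 words are obtained by truncating float32 values, the codebook is built from a tiny hypothetical calibration set, escaped positions carry a placeholder code of 0 in the dense stream, and all function names are invented for the example.

```python
import struct
from collections import Counter


def bf16_word(value):
    """BF16 bit pattern obtained by truncating a float32 to its top 16 bits."""
    return struct.unpack(">I", struct.pack(">f", value))[0] >> 16


def exponent_of(word):
    """BF16 layout: 1 sign bit, 8 exponent bits, 7 mantissa bits."""
    return (word >> 7) & 0xFF


def build_codebook(calibration_words, size=16):
    """Offline step: keep the `size` most frequent exponent values."""
    counts = Counter(exponent_of(w) for w in calibration_words)
    return [exp for exp, _ in counts.most_common(size)]


def encode(words, codebook):
    """Split each word into a short exponent code and its sign+mantissa byte."""
    index = {exp: code for code, exp in enumerate(codebook)}
    codes, sign_mantissa, escapes = [], [], []
    for position, word in enumerate(words):
        sign_mantissa.append(((word >> 15) << 7) | (word & 0x7F))
        exp = exponent_of(word)
        if exp in index:
            codes.append(index[exp])          # fits in 4 bits when size <= 16
        else:
            codes.append(0)                   # placeholder, fixed by escape
            escapes.append((position, exp))   # sparse (position, value) stream
    return codes, sign_mantissa, escapes


def decode(codes, sign_mantissa, escapes, codebook):
    """Rebuild the exact BF16 words from the dense and escape streams."""
    exponents = [codebook[code] for code in codes]
    for position, exp in escapes:
        exponents[position] = exp
    return [
        ((sm >> 7) << 15) | (exp << 7) | (sm & 0x7F)
        for sm, exp in zip(sign_mantissa, exponents)
    ]


calibration = [bf16_word(v) for v in [0.5, 1.0, 2.0, 3.0, -1.5]]
codebook = build_codebook(calibration)

words = [bf16_word(v) for v in [0.5, 1.0, -2.0, 3.5, 1e30, 0.0]]
codes, sign_mantissa, escapes = encode(words, codebook)
restored = decode(codes, sign_mantissa, escapes, codebook)
```

In this toy run the values `1e30` and `0.0` have exponents outside the calibrated codebook, so they travel through the escape stream, and decoding still reproduces every word bit for bit.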

2605.01401 2026-05-12 cs.HC cs.AI

AI Expert Twin: Capturing Expert Cognition for Human-Centred, Practice-Based Learning

Annie Yuan, Xiaohua Chen, Kalina Yacef, Judy Kay

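AI summary: This paper proposes AI Expert Twin, a cognition-centered framework for capturing the tacit knowledge in expert practice, including procedural operations, semantic concepts, and decision processes, while accounting for how value preferences, trade-offs, and uncertainty shape expert judgment. The framework formalizes expert cognition in a three-layer structured representation. A case study in a cultural-heritage workshop demonstrates feasibility and suggests transferability to vocational education and the creative industries. The approach offers a new path toward transparent, learner-centered AI educational systems.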

Comments 8 pages, 3 figures