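arXivDaily: a daily academic digest synced with the full arXiv feed, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and related fields.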

2605.14305 2026-05-15 cs.CL

Factorization-Error-Free Discrete Diffusion Language Model via Speculative Decoding

Xun Fang, Yunchen Li, Hang Yuan, Zhou Yu

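AI summary: This paper proposes a factorization-error-free discrete diffusion language model (FeF-DLLM) to address the factorization error that arises when conventional methods predict clean tokens independently. It replaces independent prediction with an exact prefix-conditioned factorization, which better preserves dependencies between tokens, and pairs this with speculative decoding to speed up inference while retaining parallel prediction. Experiments show an average accuracy gain of 5.04 percentage points across several benchmark datasets, together with a 3.86x speedup.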
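For readers unfamiliar with speculative decoding, the draft-then-verify loop can be sketched as follows. This is a minimal greedy variant under stated assumptions, not the FeF-DLLM procedure: `toy_target` and `toy_draft` are hypothetical stand-ins for a large and a small model, and the verification loop stands in for what would normally be a single batched forward pass.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[Token], max_new: int, k: int = 4) -> List[Token]:
    """Greedy draft-then-verify decoding.

    The output is identical to decoding greedily with `target` alone;
    the draft model only decides how many tokens are settled per round.
    """
    out = list(prompt)
    limit = len(prompt) + max_new
    while len(out) < limit:
        # 1. The cheap draft model proposes up to k tokens autoregressively.
        proposal: List[Token] = []
        for _ in range(min(k, limit - len(out))):
            proposal.append(draft(out + proposal))
        # 2. The target model checks every proposed position.
        for i, tok in enumerate(proposal):
            expected = target(out + proposal[:i])
            if expected != tok:
                # 3. First mismatch: keep the agreed prefix plus the
                #    target's own token, and drop the rest of the draft.
                out.extend(proposal[:i])
                out.append(expected)
                break
        else:
            # Every proposed token matched the target.
            out.extend(proposal)
    return out


def toy_target(ctx: List[Token]) -> Token:
    """Hypothetical target model: counts upward modulo 5."""
    return (ctx[-1] + 1) % 5


def toy_draft(ctx: List[Token]) -> Token:
    """Hypothetical draft model: agrees with the target except after token 3."""
    return 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 5


def greedy_decode(target: NextToken, prompt: List[Token], max_new: int) -> List[Token]:
    """Reference decoder that calls the target once per token."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(target(out))
    return out


speculative = speculative_decode(toy_target, toy_draft, [0], max_new=7)
reference = greedy_decode(toy_target, [0], max_new=7)
print(speculative == reference)
```

The design point is that a wrong draft token costs only the discarded suffix, never correctness, which is why the speedup depends on how often the draft agrees with the target.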

2605.14304 2026-05-15 cs.LG cs.AI

Matrix-Space Reinforcement Learning for Reusing Local Transition Geometry

Zuyuan Zhang, Carlee Joe-Wong, Tian Lan

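AI summary: This work proposes Matrix-Space Reinforcement Learning (MSRL), which aims to improve compositional generalization in reinforcement learning by reusing the local transition geometry of existing trajectory segments. MSRL uses positive-definite matrix descriptors to capture the first- and second-order statistics of trajectory segments, enabling algebraic composition and knowledge transfer in an abstract matrix space. Experiments show that the method outperforms existing approaches under a limited budget, demonstrating its effectiveness for cross-task learning.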

2605.14301 2026-05-15 cs.LG stat.ML

Language-Induced Priors for Domain Adaptation

Qiyuan Chen, Jiayu Zhou, Raed Al Kontar

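AI summary: In domain adaptation, when target-domain data are scarce, conventional statistical methods struggle to tell relevant source domains from irrelevant ones, which leads to negative transfer. This paper uses expert text descriptions of the target domain to build a language-induced prior (LIP) and combines it with the expectation-maximization algorithm to identify relevant source domains. The method is compatible with a range of parametric models, guides source selection when the target signal is weak, and refines that selection as data accumulate. Theoretical analysis shows that, with a correct prior, it achieves cold-start performance close to the ideal while remaining asymptotically consistent. Experiments validate the framework on estimation, prediction, and decision-making tasks.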

2605.14297 2026-05-15 cs.LG cs.AI math.OC stat.ML

Policy Optimization in Hybrid Discrete-Continuous Action Spaces via Mixed Gradients

Matias Alvo, Daniel Russo, Yash Kanoria

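AI summary: This paper studies reinforcement learning in hybrid discrete-continuous action spaces, a setting common in robot control and optimization. To address the poor gradient quality of conventional policy gradient methods in high-dimensional spaces, the authors propose Hybrid Policy Optimization (HPO), which combines pathwise gradients with score-function gradients to obtain an unbiased mixed gradient estimate that copes with discrete actions and non-smooth dynamics. Experiments show that HPO significantly outperforms PPO on tasks such as inventory control and switched linear-quadratic regulators, with a larger advantage as the continuous action dimension grows.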

2605.14294 2026-05-15 cs.AI cs.LG

Precise Verification of Transformers through ReLU-Catalyzed Abstraction Refinement

Hengjie Liu, Zhenya Zhang, Jianjun Zhao

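AI summary: As Transformer models are increasingly deployed in safety-critical domains, their formal verification has become especially important. Compared with conventional neural networks, Transformer inference involves more complex computations, such as the dot products in self-attention layers, which makes verification highly challenging. This paper proposes a ReLU-catalyzed abstraction refinement method that represents the nonlinear bounds of dot products precisely and combines them with convex relaxation to improve verification precision. Building on two classic verification methods, it extends them into an efficient and precise verification framework for Transformers. Experiments show that the method significantly improves verification precision while remaining reasonably efficient.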

Comments 32 pages, 6 figures, the full version of the paper accepted by CAV 2026

2605.14289 2026-05-15 cs.LG cs.AI cs.CL cs.CR

MetaMoE: Diversity-Aware Proxy Selection for Privacy-Preserving Mixture-of-Experts Unification

Weisen Jiang, Shuhao Chen, Sinno Jialin Pan

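AI summary: This paper proposes MetaMoE, a privacy-preserving framework for unifying Mixture-of-Experts (MoE) models, aimed at distributed settings in which expert models cannot share their training data. It selects public proxy data that are both relevant to each client's domain and diverse, using them in place of the inaccessible private data to guide router learning and improve coordination among experts. Experiments show that MetaMoE outperforms existing privacy-preserving MoE unification methods on computer vision and natural language processing tasks.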

Comments Accepted by ICML 2026

2605.14280 2026-05-15 cs.LG stat.ML

TILT: Target-induced loss tilting under covariate shift

Kakei Yamamoto, Martin J. Wainwright

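AI summary: This paper proposes TILT, an unsupervised domain adaptation method for covariate shift. It introduces a new objective that decomposes the source predictor into two components, fits both on labeled source data, and penalizes the auxiliary component on unlabeled target data; the resulting main predictor is then used for target-domain prediction. Theoretical analysis shows that, at the population level, the method implicitly induces relative importance weighting and enjoys good stability and generalization. Experiments show that TILT outperforms baselines including source-only training, exact importance weighting, and relative density ratio methods across several tasks.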

Comments 32 pages, 17 figures. Submitted to NeurIPS 2026

2605.14278 2026-05-15 cs.CV

KVPO: ODE-Native GRPO for Autoregressive Video Alignment via KV Semantic Exploration

Ruicheng Zhang, Kaixi Cong, Jun Zhou, Zhizhou Zhong, Zunnan Xu, Shuiyang Mao, Wei Liu, Xiu Li

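AI summary: This paper proposes KVPO, an ODE-native online Group Relative Policy Optimization (GRPO) framework that aligns streaming autoregressive video generators through key-value (KV) semantic exploration. It moves the source of exploration diversity from random noise to the historical KV cache, constructing generation branches that are semantically diverse yet stay on the data manifold, which improves long-horizon consistency. KVPO also introduces a surrogate policy based on trajectory velocity energy, yielding a reward-weighted contrastive objective that is fully consistent with the ODE-native formulation. Across several experimental settings it significantly improves visual quality, motion quality, and text-video alignment.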

2605.14277 2026-05-15 cs.AI cs.GT

Parallelizing Counterfactual Regret Minimization

Juho Kim, Tuomas Sandholm

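AI summary: This paper studies how to parallelize counterfactual regret minimization (CFR) to speed up the solving of large imperfect-information games. The authors reformulate CFR as a sequence of linear algebra operations, so that existing parallel computing techniques can be used to make it more efficient. The approach applies to several CFR variants, including CFR+, discounted CFR, and predictive CFR. Experiments show that a GPU-based implementation is up to four thousand times faster than existing CPU implementations.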
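One ingredient of CFR is regret matching, which turns cumulative regrets into a strategy at each information set. The sketch below applies it row by row to a table of regrets, which hints at why the step batches well: every row is independent. It is an illustrative pure-Python version, not the authors' linear-algebra formulation or their GPU code, and the function name is an assumption.

```python
from typing import List


def regret_matching(cum_regret: List[List[float]]) -> List[List[float]]:
    """Map cumulative regrets (one row per information set) to strategies.

    Each action is played in proportion to its positive cumulative regret.
    Rows are independent, so on an accelerator the same computation could
    be expressed as element-wise and row-reduction operations on a matrix.
    """
    strategies: List[List[float]] = []
    for row in cum_regret:
        positive = [max(r, 0.0) for r in row]
        total = sum(positive)
        if total > 0.0:
            strategies.append([p / total for p in positive])
        else:
            # No action has positive regret: fall back to uniform play.
            strategies.append([1.0 / len(row)] * len(row))
    return strategies


# Two information sets with three actions each.
table = regret_matching([[2.0, -1.0, 2.0], [-1.0, -3.0, 0.0]])
print(table)
```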

Comments This paper contains and extends ideas that were originally in arxiv:2408.14778

2605.14274 2026-05-15 cs.CV

CreFlow: Corrective Reflow for Sparse-Reward Embodied Video Diffusion RL

Zhenyang Ni, Yijiang Li, Ruochen Jiao, Simon Sinong Zhan, Sipeng Chen, Zhenfei Yin, Minshuo Chen, Philip Torr, Zhaoran Wang, Qi Zhu

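AI summary: This paper proposes CreFlow, an online reinforcement learning framework for improving embodied video generation models under sparse rewards. To address the fact that existing video RL rewards fail to reflect task logic accurately, it introduces a reward model based on compositional logical constraints that translates task requirements into linear temporal logic constraints, providing more accurate reward signals and localized error information. Through two key designs, a credit-aware NFT loss and a corrective reflow loss, CreFlow improves the training efficiency and stability of high-dimensional video generation. Experiments show a 23.8 percentage point gain in execution success rate on bimanual manipulation tasks.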

2605.14269 2026-05-15 cs.CV cs.AI

PhyMotion: Structured 3D Motion Reward for Physics-Grounded Human Video Generation

Yidong Huang, Zun Wang, Han Lin, Dong-Ki Kim, Shayegan Omidshafiei, Jaehong Yoon, Jaemin Cho, Yue Zhang, Mohit Bansal

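AI summary: Generating realistic human motion is one of the core challenges in video generation. To address the inability of existing reward signals to assess motion realism accurately, this paper proposes PhyMotion, a structured motion reward grounded in physics simulation. It evaluates motion along several dimensions, including kinematic plausibility, contact and balance consistency, and dynamic feasibility, giving a fine-grained assessment of human motion quality in generated videos. Experiments show that PhyMotion tracks human judgments more closely than existing methods and significantly improves motion realism and generation quality in RL-based post-training.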

Comments First two authors contributed equally, website: https://phy-motion.github.io/

English abstract

Generating realistic human motion is a central yet unsolved challenge in video generation. While reinforcement learning (RL)-based post-training has driven recent gains in general video quality, extending it to human motion remains bottlenecked by a reward signal that cannot reliably score motion realism. Existing video rewards primarily rely on 2D perceptual signals, without explicitly modeling the 3D body state, contact, and dynamics underlying articulated human motion, and often assign high scores to videos with floating bodies or physically implausible movements. To address this, we propose PhyMotion, a structured, fine-grained motion reward that grounds recovered 3D human trajectories in a physics simulator and evaluates motion quality along multiple dimensions of physical feasibility. Concretely, we recover SMPL body meshes from generated videos, retarget them onto a humanoid in the MuJoCo physics simulator, and evaluate the resulting motion along three axes: kinematic plausibility, contact and balance consistency, and dynamic feasibility. Each component provides a continuous and interpretable signal tied to a specific aspect of motion quality, allowing the reward to capture which aspects of motion are physically correct or violated. Experiments show that PhyMotion achieves stronger correlation with human judgments than existing reward formulations. These gains carry over to RL-based post-training, where optimizing PhyMotion leads to larger and more consistent improvements than optimizing existing rewards, improving motion realism across both autoregressive and bidirectional video generators under both automatic metrics and blind human evaluation (+68 Elo gain). Ablations show that the three axes provide complementary supervision signals, while the reward preserves overall video generation quality with only modest training overhead.

2605.14267 2026-05-15 cs.CV cs.AI

Image Restoration via Diffusion Models with Dynamic Resolution

Yang Zheng, Wen Li, Zhaoqiang Liu

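AI summary: This work targets the high computational cost of diffusion models in image restoration and proposes a restoration method based on dynamic-resolution diffusion models. Projecting the data onto low-dimensional subspaces reduces the computational burden. Building on existing pixel-space methods, the authors propose two new methods, SubDPS and SubDAPS, along with SubDAPS++, which further improves restoration efficiency and quality. Experiments show that the approach outperforms existing diffusion-based image restoration methods across several datasets and tasks.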

Comments Accepted by ICML 2026

2605.14266 2026-05-15 cs.AI cs.CY

Agentic AI Ecosystems in Higher Education: A Perspective on AI Agents to Emerging Inclusive, Agentic Multi-Agent AI Framework for Learning, Teaching and Institutional Intelligence

Vidya K Sudarshan, Anushka Sisodia, Reshma A Ramachandra, Sia Batra, Josephine Chong Leng Leng

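AI summary: This paper examines the prospects for AI agents in higher education and proposes an integrated multi-agent AI framework to support the coordinated operation of teaching, learning, and institutional management. Current AI tools are mostly single-task and poorly integrated, so they struggle to meet the complex needs of the educational ecosystem. Through a literature analysis, the paper identifies gaps in cross-functional integration and inclusive design in existing research, and it stresses the importance of coordinated, adaptive multi-agent systems for achieving equitable and inclusive education.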

Comments 50 pages, 14 figures, 3 tables

English abstract

Integration of artificial intelligent (AI) agents in higher education is transforming teaching, learning and administrative processes. Although existing AI agents effectively support individual tasks, their implementation remains fragmented and inefficient for handling the complexity of educational institutions. This highlights a significant research gap: the lack of integrated eco-system-level agentic multi-agent AI platform capable of coordinated planning, reasoning, and adaptive decision-making across multiple educational functions. This paper presents a forward-looking perspective on agentic multi-agent AI platform in higher education, consisting interconnected autonomous, goal driven agents that support learning, teaching, and institutional operations. It addresses timely and critical questions: Can agentic AI represent the next generation of intelligent systems in tertiary education? Can they collectively support seamless coordinated operations across teaching, learning and administrative support? To what extent can such systems foster inclusive and equitable learning for diverse learners with special educational needs? To ground this perspective, a thematic analysis of existing literature identifies four dominant themes: task-specific fragmented AI tools, the transition from single-agent to multi-agent systems, limited cross-functional integration, and insufficient focus on inclusivity and accessibility. Findings reveal a clear gap between current AI implementations and the needs of holistic, learner-centered educational ecosystem. The paper synthesizes challenges and outlines future research directions for scalable human-aligned, and inclusive agentic AI platform. The significant contribution is the incorporation of inclusive learning perspectives, highlighting how coordinated agentic multi-agent platform can support diverse learners through adaptive, multimodal interventions.

2605.14262 2026-05-15 cs.RO cs.HC

Distill: Uncovering the True Intent behind Human-Robot Communication

Ting Li, David Porfirio

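AI summary: As robots become increasingly integrated into everyday life, intuitive modes of communication such as natural language and end-user programming have become important ways of specifying autonomous robot behavior. These methods, however, struggle to capture what users actually intend. This paper proposes Distill, a communication approach that refines a user's initial task description by removing redundant steps, generalizing the meaning of individual steps, and relaxing ordering constraints between steps, and thereby arrives at a more accurate picture of the user's true needs.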

Comments 17 pages

2605.14261 2026-05-15 cs.AI cs.GT

Heuristic Pathologies and Further Variance Reduction via Uncertainty Propagation in the AIVAT Family of Techniques

Juho Kim, Tuomas Sandholm

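AI summary: This paper studies how to evaluate agent performance in multi-agent settings when samples are limited or trials are expensive, and examines the AIVAT family of techniques for reducing estimation variance. It points out that AIVAT offers little guidance on choosing the heuristic value function or handling its uncertainty, reveals potential problems when gradient descent is applied to the heuristic, and recommends fixing the heuristic before looking at the evaluation data. The authors also show how to propagate heuristic uncertainty to reduce variance further, although this may sacrifice unbiasedness. Experiments on a poker dataset show that the approach reduces the number of samples needed to reach statistical conclusions.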

2605.14258 2026-05-15 cs.LG cs.AI

Dynamics of the Transformer Residual Stream: Coupling Spectral Geometry to Network Topology

Jesseba Fernando, Grigori Guitchounts

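AI summary: This paper studies the dynamics of the residual stream in large language models and reveals a coupling between spectral geometry and network topology that emerges during training. Using eigendecomposition of the full Jacobian, the authors find that training produces a monotonic spectral gradient along model depth, accompanied by dimensional compression; these properties are learned rather than determined by the architecture. The study further shows that the topological position of a graph community in the network determines whether the Jacobian amplifies or suppresses perturbations to it, a relationship that is absent at initialization.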

2605.14253 2026-05-15 cs.CV cs.LG

Towards Real-Time Autonomous Navigation: Transformer-Based Catheter Tip Tracking in Fluoroscopy

Harry Robertshaw, Yanghe Hao, Weiyuan Deng, Benjamin Jackson, S. M. Hadi Sadati, Nikola Fischer, Tom Vercauteren, Alejandro Granados, Thomas C. Booth

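AI summary: This paper aims to develop a real-time catheter tip tracking system for fluoroscopic images, in support of reinforcement-learning-based autonomous navigation for mechanical thrombectomy. It proposes a multi-threaded processing framework that combines deep learning segmentation models with post-processing algorithms to handle low image contrast, heavy noise, and device occlusion. Experiments show that the method outperforms existing methods in segmentation accuracy, providing a reliable and efficient basis for future autonomous navigation systems.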

Comments Harry Robertshaw and Yanghe Hao contributed equally to this work. Published in the International Journal of Computer Assisted Radiology and Surgery

DOI: 10.1007/s11548-026-03647-7
Journal ref: Int J CARS (2026)

English abstract

Purpose: Mechanical thrombectomy (MT) improves stroke outcomes, but is limited by a lack of local treatment access. Widespread distribution of reinforcement learning (RL)-based robotic systems can be used to alleviate this challenge through autonomous navigation, but current RL methods require live device tip coordinate tracking to function. This paper aims to develop and evaluate a real-time catheter tip tracking pipeline under fluoroscopy, addressing challenges such as low contrast, noise, and device occlusion. Methods: A multi-threaded pipeline was designed, incorporating frame reading, preprocessing, inference, and post-processing. Deep learning segmentation models, including U-Net, U-Net+Transformer, and SegFormer, were trained and benchmarked using two-class and three-class formulations. Post-processing involved two-step component filtering, one-pixel medial skeletonization, and greedy arc-length path following with contour fall-back. Results: On manually-labeled moderate complexity fluoroscopic video data, the two-class SegFormer achieved a mean absolute error of 4.44 mm, outperforming U-Net (4.60 mm), U-Net+Transformer (6.20 mm) and all three-class models (5.19-7.74 mm). On segmentation benchmarks, the system exceeded state-of-the-art CathAction results with improvements of up to +5% in Dice scores for three-segmentation. Conclusion: The results demonstrate that the proposed multi-threaded tracking framework maintains stable performance under challenging imaging conditions, outperforming prior benchmarks, while providing a reliable and efficient foundation for RL-based autonomous MT navigation.

2605.14252 2026-05-15 cs.LG cs.AI

Not All Timesteps Matter Equally: Selective Alignment Knowledge Distillation for Spiking Neural Networks

Kai Sun, Peibo Duan, Yongsheng Huang, Guowei Zhang, Benjamin Smith, Nanxu Gong, Levin Kuhlmann

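AI summary: This paper addresses the performance gap between spiking neural networks (SNNs) and artificial neural networks (ANNs) and proposes a new knowledge distillation method, Selective Alignment Knowledge Distillation (SeAl-KD). The method drops the conventional assumption that all timesteps should be aligned uniformly: it identifies erroneous timesteps and corrects them selectively while preserving useful temporal dynamics, which improves SNN performance more effectively. Experiments show that it outperforms existing distillation methods on both static image and neuromorphic event datasets.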

2605.14251 2026-05-15 cs.CV

Generative Deep Learning for Computational Destaining and Restaining of Unregistered Digital Pathology Images

Aarushi Kulkarni, Alarice Lowe, Pratik Shah

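AI summary: This study examines conditional generative adversarial network (cGAN) methods for destaining and restaining digital pathology images, evaluated on unregistered whole-slide images (WSIs) from a different institution. To reduce the effect of domain shift, it proposes a preprocessing pipeline comprising histogram-based stain normalization and channel-wise intensity calibration. Results show that, even without image registration, the method achieves reasonably good stain reconstruction, and that restaining outperforms direct staining on several metrics, confirming the substantial influence of preprocessing on model performance.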

English abstract

Conditional generative adversarial networks (cGANs) have enabled high-fidelity computational staining and destaining of hematoxylin and eosin (H&E) in digital pathology whole-slide images (WSI). However, their ability to generalize to out-of-distribution WSI across institutions without retraining remains insufficiently characterized. Previously developed cGAN models trained on 102 registered prostate core biopsy WSIs from Brigham and Women's Hospital were evaluated on 82 spatially unregistered WSIs acquired at Stanford University. To mitigate domain shift without retraining, a preprocessing pipeline consisting of histogram-based stain normalization for H&E-stained WSIs and channel-wise intensity calibration for unstained WSIs was developed. Because image registration was intentionally omitted for real-world deployment conditions, the reported quantitative results are conservative lower bounds reflecting both model performance and limited spatial alignment. Under these conditions, virtual destaining achieved a Pearson correlation coefficient (PCC) of 0.854, structural similarity index measure (SSIM) of 0.699, and peak signal-to-noise ratio (PSNR) of 18.41 dB. H&E restaining from computationally destained outputs outperformed direct staining from ground-truth unstained inputs across all metrics (PCC: 0.798 vs. 0.715; SSIM: 0.756 vs. 0.718; PSNR: 20.08 vs. 18.51 dB), suggesting that preprocessing quality may be more limiting than model capacity. Qualitative pathological review indicated preservation of benign glandular structures while showing that malignant glands were often rendered with vessel-like morphologies. These findings support the feasibility of applying cGAN-based computational H&E staining and destaining generative models to external WSI datasets using preprocessing-based adaptation alone while defining specific morphological targets for future domain adaptation.
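Two of the metrics quoted in the abstract above, PCC and PSNR, have short closed forms. The sketch below computes them on flattened pixel lists using the standard textbook definitions; the 8-bit peak value of 255 is an assumption, since the abstract does not state the intensity range, and the sample values are made up for illustration. SSIM is omitted because it needs windowed local statistics.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def psnr(x: Sequence[float], y: Sequence[float], max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; max_value=255 assumes 8-bit images."""
    mse = sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Made-up pixel values: the second image is the first one brightened by 10.
reference_pixels = [0.0, 50.0, 100.0, 150.0]
generated_pixels = [10.0, 60.0, 110.0, 160.0]
pcc_value = pearson(reference_pixels, generated_pixels)
psnr_value = psnr(reference_pixels, generated_pixels)
print(pcc_value, psnr_value)
```

A uniform brightness offset leaves the correlation at its maximum while still lowering PSNR, which is one reason the two metrics are reported side by side.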

2605.14249 2026-05-15 cs.LG

EnergyLens: Predictive Energy-Aware Exploration for Multi-GPU LLM Inference Optimization

Zhiye Song, Kyungmi Lee, Eun Kyung Lee, Xin Zhang, Tamar Eilam, Anantha P. Chandrakasan

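AI summary: This paper proposes EnergyLens, an end-to-end framework for energy-aware optimization of large language model (LLM) inference. Through an intuitive einsum interface it captures model characteristics such as fusion, parallelism, and compute-communication overlap, and it combines load-imbalance-aware MoE modeling with a multi-GPU communication energy model to predict and optimize energy consumption in multi-GPU settings. Experiments show that EnergyLens achieves high energy-prediction accuracy across several models and configurations and reveals significant energy-efficiency differences between configurations, offering useful guidance for distributed inference optimization.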

2605.14246 2026-05-15 cs.LG cs.AI cs.SY eess.SY

Action-Conditioned Risk Gating for Safety-Critical Control under Partial Observability

Yushen Liu, Yin-Jen Chen, Ziyi Chen, Tao Wang, Heng Huang, Xugui Zhou, Yanfu Zhang

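AI summary: This work addresses safety-critical control under partial observability and proposes a reinforcement learning method based on action-conditioned risk gating to balance task performance against safety risk when observations are incomplete. It builds a compact proxy state from a finite history and learns action-conditioned predictions of short-horizon safety violations. The predicted risk is used both as a risk penalty in value learning and as a risk gate at decision time, improving control performance while maintaining safety. Experiments show that, on tasks such as blood glucose regulation and safe navigation, the method achieves a better reward-cost trade-off and runtime efficiency than conventional methods.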

2605.14242 2026-05-15 cs.LG cs.AI

Artificial Intelligence-Assistant Cardiotocography: Unified Model for Signal Reconstruction, Fetal Heart Rate Analysis, and Variability Assessment

Xiaohua Wang, Kai Yu, XuXiao Liang, Liang Wang, Chao Han

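AI summary: This study proposes an AI-based cardiotocography (CTG) model for fetal heart rate signal reconstruction, heart rate analysis, and variability assessment. The model is pretrained on large-scale unlabeled data and fine-tuned on expert-reviewed data, which improves signal reconstruction accuracy and the reliability of the analysis. The study introduces an overlapping-label (IOL) method to validate fetal heart rate; the model shows high sensitivity and specificity in detecting key heart rate decelerations and accelerations, and achieves strong AUC scores on clinical metrics.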

2605.14240 2026-05-15 cs.LG

Paraphrasing Attack Resilience of Various AI-Generated Text Detection Methods

Andrii Shportko, Inessa Verbitsky

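AI summary: This paper studies the robustness of several AI-generated text detection methods to paraphrasing attacks, evaluating fine-tuned RoBERTa, Binoculars, and text feature analysis, along with their random forest ensembles. It finds that ensembles that include Binoculars perform best but also degrade the most under attack, exposing a tension between performance and robustness in AI text detection and challenging current assumptions about the reliability of advanced detection techniques.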

Comments NAACL 2025

2605.14239 2026-05-15 cs.CV

Implicit spatial-frequency fusion of hyperspectral and lidar data via kolmogorov-arnold networks

Zekun Long, Judy X. Yang, Jing Wang, Ali Zia, Guanyiman Fu, Jun Zhou

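AI summary: This paper studies the fusion of hyperspectral image (HSI) and LiDAR data to improve classification in complex scenes. To address the shortcomings of existing methods in modeling structural discontinuities and spectral features, the authors propose IFGNet, an implicit frequency-geometry fusion network based on Kolmogorov-Arnold networks (KANs). It uses learnable spline functions to adaptively capture the highly nonlinear relationships between hyperspectral and LiDAR features, and introduces LiDAR-guided implicit aggregation modules in the spatial and frequency domains to strengthen geometry-aware representations. Experiments show that IFGNet significantly outperforms existing methods on several benchmark datasets, with higher classification accuracy and efficiency.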

Comments 6 pages, 1 figure, conference

2605.14237 2026-05-15 cs.AI

Good to Go: The LOOP Skill Engine That Hits 99% Success and Slashes Token Usage by 99% via One-Shot Recording and Deterministic Replay

Xiaohua Wang, Kai Yu, XuXiao Liang, Liang Wang, Chao Han

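AI summary: This paper proposes the LOOP SKILL ENGINE, a system aimed at the high failure rates and high compute costs that AI agents incur on repetitive tasks. Using a one-shot recording of task execution and a deterministic replay mechanism, it reaches a 99% task success rate and cuts token usage by 99%. Its core idea is to turn the tool-call trace recorded during the first run into a parameterized, deterministic execution plan; later tasks replay that plan directly without calling the large language model again, which greatly reduces overhead and keeps execution predictable.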
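The record-then-replay idea in this summary can be sketched in a few lines. Everything here is hypothetical and for illustration only: the tool names, the trace format, and the `make_plan` and `replay` helpers are assumptions, not the LOOP engine's actual interface, and passing one step's output into the next step is left out.

```python
from typing import Any, Callable, Dict, List, Tuple

Step = Tuple[str, Dict[str, Any]]                # (tool name, concrete arguments)
PlanStep = Tuple[str, Dict[str, Tuple[str, Any]]]


def make_plan(trace: List[Step], params: Dict[str, Any]) -> List[PlanStep]:
    """Turn a recorded trace into a parameterized plan.

    Any argument whose value equals one of the first run's parameters
    becomes a named slot; every other argument is kept as a constant.
    """
    plan: List[PlanStep] = []
    for tool_name, args in trace:
        template: Dict[str, Tuple[str, Any]] = {}
        for arg, value in args.items():
            slot = next((name for name, pv in params.items() if pv == value), None)
            template[arg] = ("slot", slot) if slot is not None else ("const", value)
        plan.append((tool_name, template))
    return plan


def replay(plan: List[PlanStep], tools: Dict[str, Callable[..., Any]],
           params: Dict[str, Any]) -> List[Any]:
    """Execute the plan deterministically with new parameters; no model call."""
    results = []
    for tool_name, template in plan:
        kwargs = {arg: (params[value] if kind == "slot" else value)
                  for arg, (kind, value) in template.items()}
        results.append(tools[tool_name](**kwargs))
    return results


# Hypothetical tools and a trace recorded during the first, model-driven run.
tools = {
    "search": lambda query: f"results for {query}",
    "save": lambda path, fmt: f"saved {path} as {fmt}",
}
first_run_params = {"query": "weather Paris", "path": "out.txt"}
recorded_trace = [
    ("search", {"query": "weather Paris"}),
    ("save", {"path": "out.txt", "fmt": "txt"}),
]

plan = make_plan(recorded_trace, first_run_params)
outputs = replay(plan, tools, {"query": "weather Rome", "path": "rome.txt"})
print(outputs)
```

Matching slots by value equality is the simplest possible choice and would misfire if a constant happened to equal a parameter; a real system would need a more careful way to mark which arguments vary.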

Comments 8 pages, 5 tables

2605.14235 2026-05-15 cs.LG cs.MA quant-ph

Quantum Advantage in Multi Agent Reinforcement Learning

Simranjeet Singh Dahia, Claudia Szabo

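AI summary: This paper studies the coordination advantage that quantum entanglement offers in multi-agent reinforcement learning and proposes a decentralized quantum multi-agent reinforcement learning framework based on variational quantum circuits. In the CHSH game, entangled quantum agents approach the theoretical Tsirelson bound, showing a clear quantum advantage, whereas quantum circuits without entanglement perform on par with classical methods. The experiments also show that the specific entanglement structure has a significant effect on coordination performance, and a cooperative navigation task demonstrates performance gains for quantum agents over classical methods.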
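The gap that the CHSH game exposes can be checked numerically. In the game, two players receive bits x and y and win if the XOR of their answers equals x AND y. The sketch below enumerates all deterministic classical strategies and then evaluates the textbook entangled strategy, in which both players measure a shared Bell pair in rotated bases. It is a standalone calculation, not the paper's learned circuits, and the measurement angles are the standard optimal choice rather than anything taken from the paper.

```python
import math
from itertools import product


def classical_best() -> float:
    """Best win probability over all deterministic classical strategies."""
    best = 0.0
    # A strategy fixes Alice's answer for x=0,1 and Bob's answer for y=0,1.
    for a0, a1, b0, b1 in product((0, 1), repeat=4):
        alice, bob = (a0, a1), (b0, b1)
        wins = sum((alice[x] ^ bob[y]) == (x & y)
                   for x, y in product((0, 1), repeat=2))
        best = max(best, wins / 4)
    return best


def quantum_win_probability() -> float:
    """Win probability of the standard entangled strategy.

    For the Bell state (|00> + |11>)/sqrt(2) measured in real bases rotated
    by angles ta and tb, the outcomes agree with probability cos^2(ta - tb).
    """
    alice_angles = (0.0, math.pi / 4)
    bob_angles = (math.pi / 8, -math.pi / 8)
    total = 0.0
    for x, y in product((0, 1), repeat=2):
        p_same = math.cos(alice_angles[x] - bob_angles[y]) ** 2
        # Answers must differ only when x = y = 1.
        total += (1.0 - p_same) if (x & y) else p_same
    return total / 4


classical = classical_best()
quantum = quantum_win_probability()
print(classical, quantum)
```

The classical maximum is 3/4, while the entangled strategy reaches cos²(π/8), roughly 0.854, which is the win probability corresponding to the Tsirelson bound.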

Comments 19 pages

2605.14232 2026-05-15 cs.RO

Reactive Planning based Control for Mobile Robots in Obstacle-Cluttered Environments

Li Tan, Junlin Xiong, Yan Wang, Wei Ren

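AI summary: This paper studies obstacle-avoiding motion control for mobile robots in obstacle-cluttered environments. For the case in which the robot has only partial information about the environment, it proposes a reactive planning based control strategy (RPCS) that constructs a reference trajectory and adjusts it locally to avoid obstacles, and it designs an adaptive tracking controller to improve trajectory tracking performance. The method combines reactive planning with adaptive control, and experimental results confirm its advantages and practicality.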

Comments 7 pages, 7 figures

2605.14231 2026-05-15 cs.LG cs.AI cs.SD

AudioMosaic: Contrastive Masked Audio Representation Learning

Hanxun Huang, Qizhou Wang, Xingjun Ma, Cihang Xie, Christopher Leckie, Sarah Erfani

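AI summary: This paper proposes AudioMosaic, a contrastive-learning-based audio encoder for general-purpose audio understanding. It generates positive pairs through structured time-frequency masking, which reduces memory consumption and supports efficient large-batch training. Compared with generative approaches, AudioMosaic learns more discriminative utterance-level representations, transfers well across datasets, domains, and acoustic conditions, and achieves state-of-the-art performance on several standard audio benchmarks.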

Comments ICML2026

2605.14227 2026-05-15 cs.LG cs.CL

DT-Transformer: A Foundation Model for Disease Trajectory Prediction on a Real-world Health System

Yunying Zhu, Andrew R Weckstein, Kueiyu Joshua Lin, Jie Yang

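AI summary: This study proposes DT-Transformer, a foundation model for predicting disease trajectories from electronic health records (EHRs) in a real-world health system. The model is trained on 57.1 million structured EHR records from 1.7 million patients in the Mass General Brigham (MGB) health system and can accurately predict the future course of a range of diseases. Experiments show strong predictive performance across many disease categories, offering a new approach to disease prediction based on real clinical data.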

Comments Work in Progress

2605.14221 2026-05-15 cs.CV

Automatic Landmark-Based Segmentation of Human Subcortical Structures in MRI

Ahmed Rekik, R. Jarrett Rushmore, Sylvain Bouix, Linda Marrakchi-Kacem

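AI summary: This paper studies the precise segmentation of human subcortical brain structures in magnetic resonance imaging (MRI) and proposes a landmark-guided 3D brain segmentation method. Mimicking the manual segmentation workflow of the Harvard-Oxford atlas, it automatically detects 16 key landmarks with a global-to-local network, then combines a semantic segmentation model with a landmark-driven post-processing step to split 12 coarse anatomical labels into 26 distinct structures, significantly improving the consistency and accuracy of segmentation boundaries.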

Comments 7 pages, 5 figures. Accepted for presentation at the 48th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC 2026)