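arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and more.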

2605.06156 2026-05-11 cs.LG cs.AI

Entropy-Regularized Adjoint Matching for Offline Reinforcement Learning

Abdelghani Ghanem, Mounir Ghogho

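AI summary: This paper proposes Maximum-Entropy Adjoint Matching (ME-AM), a unified framework that addresses the popularity bias and support binding caused by a fixed behavior distribution in offline reinforcement learning. By introducing a mirror-descent entropy-maximization objective and a mixture behavior prior, the method increases policy expressiveness and widens exploration, so that high-return actions are extracted from offline data more effectively. Experiments show that ME-AM outperforms or comes close to existing state-of-the-art methods on several sparse-reward continuous-control environments.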

2605.06115 2026-05-11 cs.AI

CrossCult-KIBench: A Benchmark for Cross-Cultural Knowledge Insertion in MLLMs

Zhen Zeng, Leijiang Gu, Feng Li, Jing Yu, Zenglin Shi

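AI summary: Multimodal large language models (MLLMs) are trained mainly on English data, so they often produce culturally inappropriate or incongruous responses in cross-cultural settings. The paper therefore proposes the task of cross-cultural knowledge insertion, which aims to adapt a model to a specific cultural context while preserving its original behavior in other cultures. It introduces the CrossCult-KIBench benchmark, with 9,800 image-grounded cross-cultural scenario cases for evaluating how well knowledge insertion works and how it affects non-target cultures, and proposes a memory-based conditional knowledge insertion method (MCKI) as a baseline. Experiments show that current methods still struggle to balance cultural adaptation against behavior preservation, underscoring the need for more culturally adaptive MLLMs.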

2605.05958 2026-05-11 cs.AI

Temporal Smoothness Doubly Robust Learning for Debiased Knowledge Tracing

Peilin Zhan, Wei Chen, Weilin Chen, Shuyi Pan, Ruichu Cai

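AI summary: Knowledge tracing (KT) is central to intelligent education systems, but the selectively observed educational logs it relies on introduce severe selection bias. The paper proposes a doubly robust (DR) framework that combines a propensity model with an error-imputation model and is theoretically unbiased whenever either model is accurate. To account for the temporal nature of KT, it further identifies temporal smoothness as a key factor in controlling estimator variance and, on that basis, proposes the Temporal Smoothness Doubly Robust (TSDR) method, which reduces variance while preserving unbiasedness. Experiments show significant gains for existing KT models on several real-world datasets.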
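The doubly robust idea in the summary above can be illustrated with the standard DR estimator of mean prediction error. This is a minimal sketch of the generic estimator only, not the paper's TSDR method: the temporal-smoothness term is not reproduced, and the function and argument names are hypothetical.

```python
import numpy as np

def doubly_robust_error(observed, errors, imputed, propensity):
    """Standard doubly robust estimate of the mean error over all entries.

    observed:   1 where the interaction was logged, 0 otherwise
    errors:     prediction error, only meaningful where observed == 1
    imputed:    output of the error-imputation model for every entry
    propensity: estimated probability that each entry is observed
    """
    observed = np.asarray(observed, dtype=float)
    errors = np.asarray(errors, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    propensity = np.asarray(propensity, dtype=float)
    # Inverse-propensity-weighted correction, applied only to logged entries.
    correction = np.where(observed > 0, (errors - imputed) / propensity, 0.0)
    return float(np.mean(imputed + correction))
```

If the imputation model is exact, the correction vanishes and the estimate equals the true mean error whatever the propensities are; if the propensities are exact, the correction removes the imputation bias in expectation.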

2605.05949 2026-05-11 cs.AI cs.SE

MAS-Algorithm: A Workflow for Solving Algorithmic Programming Problems with a Multi-Agent System

Yuliang Xu, Xiang Xu, Yao Wan, Hu Wei, Tong Jia

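AI summary: This paper proposes MAS-Algorithm, a multi-agent system workflow for solving algorithmic programming problems. Inspired by the practices of competitive programmers and algorithm engineers, it decomposes problem solving into modular stages that support structured reasoning, tool integration, and flexible coordination among agents, and it is designed to be extensible and general. Experiments show acceptance-rate gains on multiple benchmarks (an average of 6.48% on a self-constructed benchmark and 4.72% on LiveCodeBench-Pro), and the reasoning-process analysis and component-replacement studies illustrate the workflow's strengths and potential.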

Abstract

Algorithmic problem solving serves as a rigorous testbed for evaluating structured reasoning in AI coding systems, as it directly reflects a model's ability to perform structured reasoning in complex scenarios. Existing approaches predominantly rely on model-centric strategies, such as architectural modifications and data scaling, which are costly and offer limited interpretability. Alternative methods leveraging external tools or prompting techniques (e.g., chain-of-thought) are often fragmented and lack a unified framework. In this paper, we propose MAS-Algorithm, a systematic multi-agent workflow for algorithmic problem solving inspired by the practices of competitive programmers and algorithm engineers. Our framework decomposes the end-to-end solving process into modular stages, enabling structured reasoning, tool integration, and flexible coordination among agents. The design emphasizes both rigor and extensibility, allowing it to generalize across diverse problem types. Experimental results on a self-constructed benchmark demonstrate consistent improvements across multiple Qwen series models, achieving an average gain of 6.48% in acceptance rate. In contrast, parameter-efficient fine-tuning on the same data yields only a marginal improvement of 0.89%. We further observe a 4.72% gain on LiveCodeBench-Pro, along with consistent improvements across additional accuracy and efficiency metrics. Beyond performance gains, we conduct comprehensive analyses to better understand the reasoning process within the workflow, including error patterns and cross-scenario behaviors. We further perform customized replacement and ablation studies to explore the upper bound of the framework, showing that individual agents can contribute improvements of up to 27.7%. These results highlight the strong potential of MAS-Algorithm for advancing AI-driven algorithmic reasoning.

2605.05927 2026-05-11 cs.CL cs.SD eess.AS

Minimizing Modality Gap from the Input Side: Your Speech LLM Can Be a Prosody-Aware Text LLM

Wenqian Cui, Xiao-Hui Li, Daxin Tan, Qiyong Zheng, Irwin King

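AI summary: This paper studies the modality gap between speech large language models (SLMs) and text large language models (TLMs) and proposes narrowing it from the input side. The authors design TextPro-SLM, which combines a unified speech encoder, WhisperPro, with a trained LLM backbone so that the speech model behaves more like a prosody-aware text model. Experiments show that TextPro-SLM achieves the lowest modality gap at both the 3B and 7B scales and performs strongly on paralinguistic understanding tasks while needing only about 1,000 hours of training data.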

Comments Work in progress

2605.05866 2026-05-11 cs.AI cond-mat.mtrl-sci cs.LG

XDecomposer: Learning Prior-Free Set Decomposition for Multiphase X-ray Diffraction

Hanyu Gao, Bin Cao, Yunyue Su, Tong-Yi Zhang, Qiang Liu

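AI summary: XDecomposer is a prior-free framework for decomposing multiphase X-ray diffraction patterns. It jointly identifies and separates the crystal structures of multiple phases directly from experimental data, without relying on candidate phase lists or structural templates. The method casts multiphase diffraction analysis as a set prediction problem and combines a phase-query-driven decomposition mechanism with a reconstruction strategy consistent with diffraction physics to achieve accurate source separation and structural characterization. Experiments show markedly better reconstruction accuracy and phase identification across a range of chemical systems, making it a useful tool for data-driven multiphase XRD analysis.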

Comments 28 pages, 8 figures, 6 tables

2605.05806 2026-05-11 cs.LG

Retrieval from Within: An Intrinsic Capability of Attention-Based Models

Elad Hoffer, Yochai Blau, Edan Kinderman, Ron Banner, Daniel Soudry, Boris Ginsburg

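AI summary: This paper asks whether attention-based encoder-decoder models can retrieve directly from their internal representations instead of relying on an external retrieval system. The authors propose the INTRA framework, which uses the decoder's attention to score pre-encoded evidence chunks and feeds them directly into generation as context, unifying retrieval and generation. Experiments show that INTRA outperforms conventional retrieve-then-generate pipelines on question answering, in both evidence recall and end-to-end answer quality, indicating that attention models already have an intrinsic retrieval capability that can be elicited.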

2605.05732 2026-05-11 cs.LG cs.AI

CRAFT: Forgetting-Aware Intervention-Based Adaptation for Continual Learning

Md Anwar Hossen, Fatema Siddika, Juan Pablo Munoz, Tanya Roosta, Ali Jannesari

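AI summary: This paper proposes CRAFT, a continual learning framework that targets the catastrophic forgetting large language models are prone to when they keep adapting to new tasks. Instead of updating model weights directly, the method learns low-rank interventions that adjust hidden representations, so the model adapts to new tasks while keeping its existing capabilities. CRAFT combines three stages (task routing, regularization, and representation merging) and uses KL divergence as a unifying objective to control forgetting and improve performance. Experiments show that it outperforms strong LoRA-based baselines on multiple benchmarks and model scales.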

Comments 24 pages

2605.05693 2026-05-11 cs.AI cs.LG

Saliency-Aware Regularized Quantization Calibration for Large Language Models

Yanlong Zhao, Xiaoyuan Cheng, Huihang Liu, Baihua He, Xinyu Zhang, Harrison Bo Hua Zhu, Wenlong Chen, Li Zeng, Zhuo Sun

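AI summary: This paper proposes SARQC, a new quantization calibration method for improving large language models deployed at low bit widths. It adds a regularization term that explicitly controls how far the quantized weights deviate from the original floating-point weights, reducing the generalization risk that arises when calibration data is limited or unrepresentative. SARQC further applies a saliency-aware regularization strategy so that quantization concentrates on preserving the weights in the most important parts of the model. Experiments show improved inference performance on several large-model tasks with no additional inference overhead.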

2605.05674 2026-05-11 cs.CV cs.AI cs.LG

EGA: Adapting Frozen Encoders for Vector Search with Bounded Out-of-Distribution Degradation

Dongfang Zhao

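AI summary: This study addresses the performance drop that vector search systems built on frozen visual encoders suffer on queries from unseen classes, and proposes a residual adapter called EGA. Through three core design choices (zero initialization, a local triplet loss, and hypersphere projection), EGA bounds the perturbation applied to unseen-class regions while still optimizing fully for seen classes. Experiments show that EGA substantially improves worst-case label precision on several out-of-distribution benchmarks and works with a variety of strong encoder models.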
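The three ingredients named in the summary above (zero initialization, a triplet loss, and hypersphere projection) can be sketched in a few lines. This is an illustrative assumption of how such an adapter might look, not EGA itself: the two-layer bottleneck, the class name `ResidualAdapter`, and the margin value are all hypothetical, and the "local" aspect of the paper's triplet loss is not reproduced.

```python
import numpy as np

class ResidualAdapter:
    """Hypothetical residual adapter placed on top of frozen encoder embeddings."""

    def __init__(self, dim, hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.w_down = rng.normal(scale=0.02, size=(dim, hidden))
        # Zero-initialized output layer: the adapter starts as the identity map.
        self.w_up = np.zeros((hidden, dim))

    def __call__(self, x):
        x = np.asarray(x, dtype=float)
        residual = np.maximum(x @ self.w_down, 0.0) @ self.w_up
        y = x + residual
        # Hypersphere projection: rescale every embedding to unit L2 norm.
        return y / np.linalg.norm(y, axis=-1, keepdims=True)

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Plain triplet margin loss on squared Euclidean distances."""
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))
```

Because the output layer starts at zero, an untrained adapter returns the normalized frozen embedding unchanged, which is one simple way to guarantee that adaptation begins from the original encoder's behavior.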

Comments added ack and github link

2605.05615 2026-05-11 cs.LG cs.CY

LLMSpace: Carbon Footprint Modeling for Large Language Model Inference on LEO Satellites

Lei Jiang, Adrian Ildefonso, Daniel Loveless, Fan Chen

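AI summary: This paper presents LLMSpace, the first framework for modeling the carbon footprint of large language model inference on AI satellites. It jointly accounts for operational carbon, embodied (manufacturing) carbon, satellite support subsystems, radiation-hardened hardware, and LLM-specific workload characteristics, and it reveals key trade-offs among carbon footprint, latency, hardware design, and service lifetime in on-orbit inference, providing a reference point for sustainable LLM inference in space.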

Comments 12 pages, 4 figures, 6 tables

2605.05583 2026-05-11 cs.AI cs.CL

Belief Memory: Agent Memory Under Partial Observability

Junfeng Liao, Qizhou Wang, Jianing Zhu, Bo Du, Rui Yan, Xiuying Chen

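AI summary: In partially observable environments, agents rely on external memory to accumulate long-term knowledge, but existing methods usually store each observation as a deterministic conclusion, ignoring its inherent uncertainty and ambiguity and leading to self-reinforcing errors. The paper proposes BeliefMem, which stores observations as multiple candidate conclusions with associated probabilities, updates those probabilities dynamically with the Noisy-OR rule, and presents all candidates together with their probabilities at retrieval time, preserving uncertainty and giving the agent more flexibility in its decisions. Experiments show strong results on several benchmarks, with the best performance retained even when data is limited.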
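The Noisy-OR rule mentioned in the summary above combines independent pieces of evidence as p = 1 - Π(1 - p_i). The following is a minimal sketch of a memory entry that keeps several candidate conclusions and updates them this way; the `BeliefEntry` class and its method names are hypothetical and are not taken from the paper.

```python
def noisy_or(probabilities):
    """Probability that at least one independent piece of evidence holds."""
    remaining = 1.0
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
        remaining *= 1.0 - p
    return 1.0 - remaining

class BeliefEntry:
    """Hypothetical memory entry holding candidate conclusions and probabilities."""

    def __init__(self):
        self.candidates = {}

    def observe(self, conclusion, confidence):
        # Fold the new observation into the existing belief with Noisy-OR.
        prior = self.candidates.get(conclusion, 0.0)
        self.candidates[conclusion] = noisy_or([prior, confidence])

    def retrieve(self):
        # Return every candidate with its probability, most likely first.
        return sorted(self.candidates.items(), key=lambda kv: kv[1], reverse=True)
```

Repeated weak evidence for the same conclusion raises its probability without ever discarding the alternatives, so a later retrieval still exposes the ambiguity to the agent.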

2605.05558 2026-05-11 cs.AI cs.CY

Who Prices Cognitive Labor in the Age of Agents? Compute-Anchored Wages

Siqi Zhu

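AI summary: This paper examines how AI agents affect the pricing of cognitive labor and argues that the conventional view, that agent labor is supplied highly elastically and therefore pushes wages down, gets the mechanism wrong. It proposes that an agent is essentially a production technology that converts compute capital into cognitive labor, so the market that sets the equilibrium wage for cognitive labor shifts from the labor market to the compute capital market. Using factor pricing theory, the author derives a Compute-Anchored Wage (CAW) bound showing that, for substitutable tasks, human wages are capped by the rental rate of compute capital, the compute intensity of agent labor, and relative productivity. The conclusion is that the labor market no longer dominates the pricing of cognitive labor.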

2605.04279 2026-05-11 cs.LG

Gradient Flow Structure and Quantitative Dynamics of Multi-Head Self-Attention

Ayan Pendharkar

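AI summary: This paper studies the dynamics of multi-head self-attention and uncovers its gradient flow structure on the unit sphere. By constructing a multi-head energy function, the author analyzes geometric interference between attention heads, identifies a radial shadow term that obstructs monotonicity, and gives sufficient conditions that guarantee monotonicity. The study also finds that heterogeneous heads accelerate clustering and, in a simplified model, derives the critical temperature parameter that governs clustering behavior, providing a theoretical basis for understanding clustering and stability in Transformer models.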

Comments 20 pages, 5 figures

2605.03067 2026-05-11 cs.AI cs.GT

Computing Thiele Rules on Interval Elections and their Generalizations

Dimitris Avramidis, Alexandra Lassota, Ulrike Schmidt-Kraepelin, Adrian Vetta

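AI summary: This paper studies the computation of Thiele rules, in particular Proportional Approval Voting (PAV), on interval elections and their generalizations. On the candidate interval (CI) domain, Thiele rules can be solved in polynomial time by linear programming (LP), but the voter interval (VI) domain poses a computational complexity challenge. The authors prove that on the VI domain the standard LP still has an integral optimal solution even though the constraint matrix is not totally unimodular, and they give a fast algorithm. They then extend the study to the more general voter-candidate interval (VCI) and linearly consistent (LC) domains, establish the containment relations among them, and give an equivalent definition of the LC domain, offering a new perspective on voting rules under such structured preferences.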
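For readers unfamiliar with PAV, the rule scores a committee W as the sum over voters of the harmonic number H(|A_v ∩ W|), where A_v is the voter's approval set. The sketch below computes that score and finds a winning committee by brute-force enumeration; it is only an illustration of the objective, takes exponential time in general, and does not reproduce the LP-based algorithms the paper studies. Function names are hypothetical.

```python
from itertools import combinations

def pav_score(ballots, committee):
    """PAV score: each voter contributes 1 + 1/2 + ... + 1/h for h approved winners."""
    committee = set(committee)
    score = 0.0
    for approved in ballots:
        hits = len(committee & set(approved))
        score += sum(1.0 / j for j in range(1, hits + 1))
    return score

def pav_brute_force(ballots, candidates, k):
    """Enumerate all size-k committees and return one with maximum PAV score."""
    return max(
        combinations(sorted(candidates), k),
        key=lambda committee: pav_score(ballots, committee),
    )
```

With three voters approving {a, b} and one voter approving {c}, the committee (a, b) scores 3 × 1.5 = 4.5, while (a, c) and (b, c) each score 4, so (a, b) wins for k = 2.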

Comments 19 pages

2605.02971 2026-05-11 cs.LG cs.AI cs.CL

Multilingual Safety Alignment via Self-Distillation

Ruiyang Qin, Qingzhuo Wang, Dongrui Liu, Qiang Li, Zhihua Wei, Wen Shen

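AI summary: Large language models have a serious multilingual safety alignment problem: their safeguards are strong in high-resource languages, but they are highly vulnerable to jailbreak attacks in low-resource languages. This paper proposes Multilingual Self-Distillation (MSD), a cross-lingual safety transfer framework that moves safety capabilities from high-resource to low-resource languages without requiring high-quality response data in each language. The method introduces Dual-Perspective Safety Weighting (DPSW), which jointly considers the teacher and student perspectives to adjust the penalty weights on safety-critical tokens dynamically, improving cross-lingual safety alignment. Experiments show superior safety performance on a range of multilingual jailbreak and utility benchmarks, with effective generalization to harder datasets and unseen languages.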

2605.02881 2026-05-11 cs.RO

MolmoAct2: Action Reasoning Models for Real-world Deployment

Haoquan Fang, Jiafei Duan, Donovan Clay, Sam Wang, Shuo Liu, Weikai Huang, Xiang Fan, Wei-Chuan Tsai, Shirui Chen, Yi Ru Wang, Shanli Xing, Jaemin Cho, Jae Sung Park, Ainaz Eftekhar, Peter Sushko, Karen Farley, Angad Wadhwa, Cole Harrison, Winson Han, Ying-Chun Lee, Eli VanderBilt, Rose Hendrix, Suveen Ellawela, Lucas Ngoo, Joyce Chai, Zhongzheng Ren, Ali Farhadi, Dieter Fox, Ranjay Krishna

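AI summary: This paper presents MolmoAct2, a fully open action reasoning model designed for real-world deployment, which targets the performance and deployment limitations that current vision-language-action (VLA) models face in real environments. The work introduces MolmoER as a dedicated vision-language model backbone, trained on large-scale data to strengthen spatial and embodied reasoning, releases several new datasets and the OpenFAST action tokenizer, and revises the model architecture to improve inference efficiency. Experiments show that MolmoAct2 outperforms existing state-of-the-art models on multiple simulation and real-world benchmarks, making action reasoning markedly more practical and reliable.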

Comments 31 pages, project page: https://allenai.org/blog/molmoact2

2605.02206 2026-05-11 cs.CV cs.LG

Metric Unreliability in Multimodal Machine Unlearning: A Systematic Analysis and Principled Unified Score

Abdullah Ahmad Khan, Hamid Laga, Ferdous Sohel

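AI summary: This paper systematically analyzes the reliability of evaluation metrics for multimodal machine unlearning and shows that five commonly used metrics rank methods in significantly conflicting ways across benchmarks. It proposes a Unified Quality Score (UQS) that weights each metric by how well it correlates with distance from the ideal model, which substantially improves the consistency and stability of evaluation and is validated in several experiments. The work offers more reliable methodological guidance for evaluating unlearning in multimodal models.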

Comments 9 pages, 6 figures, NeurIPS 2026

2605.02201 2026-05-11 cs.CV

Super-Resolution of Airborne Laser Scanning Point Clouds for Forest Inventory

Jinyuan Shao, Sangyoong Park, Chunxi Zhao, Ayman Habib, Songlin Fei

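AI summary: Airborne laser scanning (ALS) point clouds are sparse and noisy, which makes individual-tree identification in forest inventory inaccurate. This study proposes 3DFSR, a deep learning model based on a 3D convolutional neural network that increases point density and reduces noise at the same time. Experiments on temperate and boreal forest datasets show that it outperforms existing algorithms and substantially improves the accuracy of stem detection, diameter at breast height (DBH) estimation, and stem reconstruction. The method also handles point clouds of different densities and generalizes across data from different LiDAR platforms without transfer learning.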

Abstract

Airborne Laser Scanning (ALS) can collect point clouds across large areas, enabling large-scale forest inventory. However, ALS point clouds are sparse and noisy, resulting in inaccurate individual-tree-level forest inventory, such as stem localization and tree size estimation. To overcome this problem, we propose a deep learning model, 3D Forest Super Resolution (3DFSR), to simultaneously improve point density and reduce noise for ALS forest point clouds. 3DFSR is a voxel-based CNN with a U-Net architecture. The proposed 3DFSR is evaluated on ALS point clouds collected in both temperate forests in the U.S. and boreal forests in Germany. Experimental results demonstrate that 3DFSR can generate finer point clouds of tree structure than other state-of-the-art point cloud super-resolution algorithms, achieving 0.249 m Chamfer Distance and 2.711 m Hausdorff Distance. Furthermore, to verify the effectiveness of 3DFSR point clouds in forest inventory, we conduct stem detection, DBH measurements, and stem reconstruction on both original ALS point clouds and 3DFSR enhanced point clouds. We find that stem detection and reconstruction algorithms developed for TLS/MLS point clouds can directly work on our 3DFSR point clouds, and DBH can be derived with a circle-fitting method. The F1 score of stem detection improves from 0.71 on original ALS point clouds to 0.97 on 3DFSR point clouds; DBH estimation improves from 13.45 cm RMSE using allometric equations to 6.43 cm using circle fitting; compared to stems reconstructed from MLS point clouds, stems reconstructed from 3DFSR point clouds have a Chamfer Distance of 0.170 m, a Hausdorff Distance of 0.377 m, and an R² of 0.95 for volume estimation. Finally, we find that the proposed 3DFSR is applicable to point densities from 10 to 1700 points/m²; it can also be generalized across data collected from different LiDAR platforms without transfer learning.
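The abstract reports Chamfer and Hausdorff distances between point clouds. A minimal sketch of both metrics follows, under stated assumptions: the Chamfer distance here is the sum of the two directed mean nearest-neighbor Euclidean distances, and the Hausdorff distance is the symmetric maximum. Other conventions exist (squared distances, averaging the two directions), and the abstract does not say which one the authors use. The dense pairwise matrix is only practical for small clouds.

```python
import numpy as np

def pairwise_distances(a, b):
    """Euclidean distance between every point in a (N, 3) and every point in b (M, 3)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))

def chamfer_distance(a, b):
    """Sum of the mean nearest-neighbor distances in both directions."""
    d = pairwise_distances(np.asarray(a, dtype=float), np.asarray(b, dtype=float))
    return float(d.min(axis=1).mean() + d.min(axis=0).mean())

def hausdorff_distance(a, b):
    """Largest nearest-neighbor distance, taken over both directions."""
    d = pairwise_distances(np.asarray(a, dtype=float), np.asarray(b, dtype=float))
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))
```

Chamfer distance averages over points and so reflects overall fidelity, while Hausdorff distance is driven by the single worst outlier, which is consistent with the abstract reporting a much larger Hausdorff value than Chamfer value.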

2605.01999 2026-05-11 cs.AI

TumorXAI: Self-Supervised Deep Learning Framework for Explainable Brain MRI Tumor Classification

Abrar Hossain Zahin, Amit Kumar Saha, Tanvir Mridha, Saifur Rahman, Jannatul Ferdous Prome, Raima Husna, Israt Jahan, Ahmed Wasif Reza

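AI summary: This paper proposes TumorXAI, a self-supervised deep learning framework for explainable brain MRI tumor classification. Built on a ResNet-50 network, it is trained and evaluated with several self-supervised learning methods (such as SimCLR and BYOL) on a dataset of 4,448 MRI images covering 17 tumor types, substantially improving classification performance and outperforming conventional supervised models when labels are limited. With explainability techniques such as Grad-CAM, the model achieves high classification accuracy while making its decision process easier to visualize and interpret.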

Comments 16 pages, 9 figures, 6 Tables

2605.01862 2026-05-11 cs.LG

QHyer: Q-conditioned Hybrid Attention-mamba Transformer for Offline Goal-conditioned RL

Xing Lei, Jincheng Wang, Xuetao Zhang, Donglin Wang

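AI summary: This paper proposes QHyer, a new offline goal-conditioned reinforcement learning framework for the challenges posed by partially observable data and history dependence in real-world environments. QHyer replaces the conventional return-to-go (RTG) target with a state-conditioned Q-estimator, which strengthens behavior stitching across trajectories, and adopts a gated hybrid attention-Mamba architecture that performs content-adaptive history compression while preserving local dynamics. Experiments show state-of-the-art performance on both non-Markovian and Markovian datasets, confirming its effectiveness across diverse settings.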

Comments ICML 2026

2605.01717 2026-05-11 cs.CL cs.AI

TCDA: Thread-Constrained Discourse-Aware Modeling for Conversational Sentiment Quadruple Analysis

Xinran Li, Xinze Che, Yifan Lyu, Zhiqi Huang, Xiujuan Xu

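AI summary: This paper studies sentiment quadruple analysis in multi-turn conversations, with the aim of capturing the complex semantic relations within a dialogue. To address the weaknesses of existing methods with respect to structural noise, temporal modeling, and distance dilution, it proposes a new framework that combines a thread-constrained directed acyclic graph (TC-DAG) with discourse-aware rotary position embedding (D-RoPE), improving the accuracy and robustness of conversational sentiment analysis. Experiments show state-of-the-art performance on two benchmark datasets.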

Comments Accepted to IJCAI 2026 (Main Track)

2605.01459 2026-05-11 cs.CV cs.AI

SRGAN-CKAN: Expressive Super-Resolution with Nonlinear Functional Operators under Minimal Resources

Roberto Isai Navaro-Aviña, Eduardo Said Merin-Martinez, Andres Mendez-Vazquez, Eduardo Rodriguez-Tello

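AI summary: This paper proposes SRGAN-CKAN, a hybrid super-resolution framework that aims to increase the expressiveness of image super-resolution under limited computing resources. It reformulates convolution as a spline-based nonlinear patch transformation, introducing convolutional Kolmogorov-Arnold networks (CKAN) to model complex structures and high-frequency textures more effectively in local regions. Experiments show that the method improves perceptual quality while preserving reconstruction fidelity, and that it is efficient and effective when compute is constrained.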

2605.01333 2026-05-11 cs.CL

OralMLLM-Bench: Evaluating Cognitive Capabilities of Multimodal Large Language Models in Dental Practice

Rongyang Wang, Shuang Zhou, Jiashuo Wang, Wenya Xie, Xiaoxia Che

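AI summary: This paper presents OralMLLM-Bench, a comprehensive benchmark for evaluating the cognitive capabilities of multimodal large language models in dental image analysis. The benchmark covers three key imaging modalities (periapical, panoramic, and lateral cephalometric radiographs), defines four cognitive categories (perception, comprehension, prediction, and decision-making), and builds 27 clinically relevant tasks from public datasets, with 3,820 expert assessments. The study compares six frontier models with clinicians, identifies the models' strengths and limitations, and offers recommendations for improving them, helping advance dental AI systems that align better with clinical cognition and workflows.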

Comments 21 pages, 4 figures, 5 tables

2605.01240 2026-05-11 cs.LG cs.AI

Rhamba: Region-Aware Hybrid Attention-Mamba Framework for Self-Supervised Learning in Resting-State fMRI

Ruthwik Reddy Doodipala, Pankaj Pandey, Pratheek Eranki, Carolina Torres-Rojas, Manob Jyoti Saikia, Ranganatha Sitaram

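AI summary: This paper proposes Rhamba, a region-aware hybrid Attention-Mamba framework for self-supervised learning on resting-state functional magnetic resonance imaging (fMRI). It combines anatomically guided masking with hybrid Attention-Mamba architectures, pretrains on the ABIDE dataset with masking strategies of differing spatial specificity, and achieves superior performance on schizophrenia and attention-deficit/hyperactivity disorder classification. Experiments show that the Mamba-Attention (MA) hybrid configuration achieves the highest average AUROC across the downstream datasets, and the interpretability analysis of model predictions reveals a complex interaction between masking strategy and network architecture.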

Abstract

Self-supervised pretraining is promising for large-scale neuroimaging, yet the impact of region-aware masking and hybrid sequence modeling remains underexplored. In this work, we introduce Rhamba, a region-aware pretraining framework that integrates anatomically guided masking with hybrid Attention-Mamba architectures for resting state functional magnetic resonance imaging (fMRI) analysis. Models were pretrained on the ABIDE dataset using region-aligned patch embeddings and three masking strategies (Any, Majority, and Pure) with increasing spatial specificity. We evaluated four architectural variants: a Mamba only model, an Alternate architecture with interleaved Mamba and Attention blocks, and two hybrid encoder-decoder configurations (Attention-Mamba (AM) and Mamba-Attention (MA)). The pretrained models were fine-tuned on downstream classification tasks using the COBRE and ADHD-200 datasets for schizophrenia and attention-deficit/hyperactivity disorder discrimination. We employed Integrated Gradients, an explainable AI method, to identify the brain regions contributing to model predictions. Masking strategy strongly influenced reconstruction behavior, with reconstruction loss following a consistent ordering (Any > Majority > Pure). However, this trend did not directly translate into downstream performance, where differences were modest and dataset-dependent. The hybrid architecture with the MA configuration achieved the highest average AUROC across both datasets, and Rhamba outperformed state-of-the-art methods in comparative evaluation. Region-wise analysis showed that peak performance depends on the interaction between masking strategy and architecture rather than a single dominant configuration. Overall, Rhamba offers a flexible framework for balancing interpretability, scalability, and performance in large-scale fMRI representation learning.

2605.01195 2026-05-11 cs.RO

TAIL-Safe: Task-Agnostic Safety Monitoring for Imitation Learning Policies

Riad Ahmed, Momotaz Begum

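AI summary: TAIL-Safe is a safety monitoring method for imitation learning policies that targets deployment failures caused by sensitivity to initial conditions and by approximation error. It builds a continuous Q-value function from three task-agnostic metrics (visibility, recognizability, and graspability), uses it to identify the set of state-action pairs in which the policy can execute the task safely, and applies a gradient-ascent mechanism to steer the policy back to the safe region. Experiments show that TAIL-Safe effectively raises the task success rate of imitation learning policies under runtime perturbations.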

2605.01006 2026-05-11 cs.CL cs.CY

Can AI Debias the News? LLM Interventions Improve Cross-Partisan Receptivity but LLMs Overestimate Their Own Effectiveness

Faisal Feroz, Jonas R. Kunst

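AI summary: This study examines the potential and limits of large language models (LLMs) for reducing news bias and improving cross-partisan receptivity. In two preregistered experiments, interventions that substantively reframed liberal news headlines significantly increased conservative readers' trust and willingness to engage, whereas minor adjustments to surface language had no clear effect. The study also finds that, although LLMs show some intervention effect in simulated settings, their estimates of their own effectiveness are quantitatively inaccurate and lack psychological realism, indicating that current models still need human oversight to ensure interventions work.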

2605.00834 2026-05-11 cs.LG cs.CC cs.IT math.IT

Polynomial-Time Optimal Group Selection via the Double-Commutator Eigenvalue Problem

Mitchell A. Thornton

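AI summary: Within the algebraic diversity framework, this paper studies how to efficiently select the group structure that best matches the covariance of a high-dimensional observation. Conventional approaches enumerate the subgroups of the symmetric group in exponential time, whereas this paper reduces the problem to a generalized eigenvalue problem on the double commutator of the covariance matrix, yielding a polynomial-time algorithm that constructs the optimal group generator directly in closed form. The method is computationally efficient and comes with a certifiable optimality guarantee, establishing a new theoretical link among group theory, matrix analysis, and statistical estimation.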

Comments v2: 2 theorems, 4 open problems, §X.A correction added; 1 reference added

Abstract

The algebraic diversity framework generalizes temporal averaging over multiple observations to algebraic group action on a single observation for second-order statistical estimation. The central open problem in this framework is $\textit{group selection}$: given an $M$-dimensional observation with unknown covariance structure, find the finite group whose spectral decomposition best matches the covariance. Naive enumeration of all subgroups of the symmetric group $S_M$ requires exponential time in $M$. We prove that this combinatorial problem reduces to a generalized eigenvalue problem derived from the double commutator of the covariance matrix, yielding a polynomial-time algorithm with complexity $O(d^2M^2 + d^3)$, where $d$ is the dimension of a generator basis. The minimum eigenvector of the double-commutator matrix directly constructs the optimal group generator in closed form, with no iterative optimization. The reduction is exact: the double-commutator minimum eigenvalue is zero if and only if the optimal generator lies in the span of the basis, and its magnitude provides a certifiable optimality gap when it does not. This problem does not appear in the standard catalogs of computational complexity (Garey and Johnson, 1979) and represents a new class linking group theory, matrix analysis, and statistical estimation. We establish connections to independent component analysis (JADE), structured matrix nearness problems, and simultaneous matrix diagonalization, and we show that the double-commutator formulation is the unique approach that is simultaneously polynomial-time, closed-form, and certifiable. We extend the framework to non-Abelian symmetry recovery via a Sequential GEVP with deflation, and add two identifiability theorems characterizing the commutant-lattice ambiguity and the dichotomy on whether $\mathrm{Aut}(\mathbf{R})$ recovers a generative subgroup or only a supergroup.

2605.00663 2026-05-11 cs.RO cs.CV

Affordance Agent Harness: Verification-Gated Skill Orchestration

Haojian Huang, Jiahao Shi, Yinchuan Li, Yingcong Chen

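AI summary: This paper proposes Affordance Agent Harness, a closed-loop runtime system for identifying the regions an agent can interact with in open-world scenes. The system integrates multiple heterogeneous skills, combines them with an experience memory and a cost-control mechanism to determine interaction regions reliably, and uses a router to select and parameterize skills dynamically. Its core contribution is a verifier that decides whether an interaction decision can be committed based on self-consistency, cross-scale stability, and evidence sufficiency, improving the accuracy of interaction localization while keeping inference efficient.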

Comments 43 pages, 22 figures, 8 tables. Ongoing work

2605.00425 2026-05-11 cs.AI

AEM: Adaptive Entropy Modulation for Multi-Turn Agentic Reinforcement Learning

Haotian Zhao, Songlin Zhou, Yuxin Zhang, Stephen S.-T. Yau, Wenyu Zhang, Lun Tian, Tianshu Zhu, Yifeng Huang, Yucheng Zeng, Jingnan Gu, Daxiang Dong, Jianmin Wu

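AI summary: This paper proposes AEM, a supervision-free credit assignment method for multi-turn agentic reinforcement learning that addresses the difficulty of assigning credit under sparse rewards. AEM adaptively modulates entropy dynamics to strike a better balance between exploration and exploitation. Its key idea is to lift entropy dynamics from the level of individual tokens to the level of complete responses, which reduces the effect of sampling noise and matches the effective action granularity of large language models more closely. Experiments show that AEM significantly improves reinforcement learning baselines on several benchmark tasks.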

Comments 30 pages