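arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and related fields.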

2603.08128 2026-03-10 cs.RO cs.LG

TRIAGE: Type-Routed Interventions via Aleatoric-Epistemic Gated Estimation in Robotic Manipulation and Adaptive Perception -- Don't Treat All Uncertainty the Same

Divake Kumar, Sina Tayebati, Devashri Naik, Patrick Poggi, Amanda Sofie Rios, Nilesh Ahuja, Amit Ranjan Trivedi

Abstract

Most uncertainty-aware robotic systems collapse prediction uncertainty into a single scalar score and use it to trigger uniform corrective responses. This aggregation obscures whether uncertainty arises from corrupted observations or from mismatch between the learned model and the true system dynamics. As a result, corrective actions may be applied to the wrong component of the closed loop, degrading performance relative to leaving the policy unchanged. We introduce a lightweight post hoc framework that decomposes uncertainty into aleatoric and epistemic components and uses these signals to regulate system responses at inference time. Aleatoric uncertainty is estimated from deviations in the observation distribution using a Mahalanobis density model, while epistemic uncertainty is detected using a noise robust forward dynamics ensemble that isolates model mismatch from measurement corruption. The two signals remain empirically near orthogonal during closed loop execution and enable type specific responses. High aleatoric uncertainty triggers observation recovery, while high epistemic uncertainty moderates control actions. The same signals also regulate adaptive perception by guiding model capacity selection during tracking inference. Experiments demonstrate consistent improvements across both control and perception tasks. In robotic manipulation, the decomposed controller improves task success from 59.4% to 80.4% under compound perturbations and outperforms a combined uncertainty baseline by up to 21.0%. In adaptive tracking inference on MOT17, uncertainty-guided model selection reduces average compute by 58.2% relative to a fixed high capacity detector while preserving detection quality within 0.4%. Code and demo videos are available at https://divake.github.io/uncertainty-decomposition/.
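The type-routed idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the thresholds, and the diagonal-covariance simplification of the Mahalanobis distance are all assumptions made for illustration, and the noise-robust training of the forward dynamics ensemble is not reproduced here (ensemble predictions are simply passed in).

```python
import math
import statistics


def fit_diagonal_gaussian(observations):
    """Per-dimension mean and variance of clean reference observations.

    Assumption: a diagonal covariance, which simplifies the full
    Mahalanobis density model mentioned in the abstract.
    """
    dims = list(zip(*observations))
    means = [statistics.fmean(d) for d in dims]
    # A small constant keeps the division below well defined.
    variances = [statistics.pvariance(d) + 1e-8 for d in dims]
    return means, variances


def mahalanobis_diag(x, means, variances):
    """Mahalanobis distance of x under a diagonal-covariance Gaussian."""
    return math.sqrt(
        sum((xi - m) ** 2 / v for xi, m, v in zip(x, means, variances))
    )


def ensemble_disagreement(predictions):
    """Mean per-dimension standard deviation across ensemble predictions."""
    dims = list(zip(*predictions))
    return statistics.fmean([statistics.pstdev(d) for d in dims])


def route(aleatoric, epistemic, a_thresh=3.0, e_thresh=0.5):
    """Pick type-specific responses; both thresholds are hypothetical."""
    actions = []
    if aleatoric > a_thresh:
        actions.append("recover_observation")
    if epistemic > e_thresh:
        actions.append("moderate_control")
    return actions or ["nominal"]


# Usage: a corrupted observation with an ensemble that still agrees.
clean = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
means, variances = fit_diagonal_gaussian(clean)
aleatoric = mahalanobis_diag([10.0, 2.0], means, variances)
epistemic = ensemble_disagreement([[1.0, 1.0], [1.0, 1.0]])
print(route(aleatoric, epistemic))
```

Keeping the two scores separate is what allows each response to target a different part of the loop; summing them into one scalar would lose that distinction.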

2603.08127 2026-03-10 cs.CL

EvoScientist: Towards Multi-Agent Evolving AI Scientists for End-to-End Scientific Discovery

Yougang Lyu, Xi Zhang, Xinhao Yi, Yuyue Zhao, Shuyu Guo, Wenxiang Hu, Jan Piotrowski, Jakub Kaliski, Jacopo Urbani, Zaiqiao Meng, Lun Zhou, Xiaohui Yan

Abstract

The increasing adoption of Large Language Models (LLMs) has enabled AI scientists to perform complex end-to-end scientific discovery tasks requiring coordination of specialized roles, including idea generation and experimental execution. However, most state-of-the-art AI scientist systems rely on static, hand-designed pipelines and fail to adapt based on accumulated interaction histories. As a result, these systems overlook promising research directions, repeat failed experiments, and pursue infeasible ideas. To address this, we introduce EvoScientist, an evolving multi-agent AI scientist framework that continuously improves research strategies through persistent memory and self-evolution. EvoScientist comprises three specialized agents: a Researcher Agent (RA) for scientific idea generation, an Engineer Agent (EA) for experiment implementation and execution, and an Evolution Manager Agent (EMA) that distills insights from prior interactions into reusable knowledge. EvoScientist contains two persistent memory modules: (i) an ideation memory, which summarizes feasible research directions from top-ranked ideas while recording previously unsuccessful directions; and (ii) an experimentation memory, which captures effective data processing and model training strategies derived from code search trajectories and best-performing implementations. These modules enable the RA and EA to retrieve relevant prior strategies, improving idea quality and code execution success rates over time. Experiments show that EvoScientist outperforms 7 open-source and commercial state-of-the-art systems in scientific idea generation, achieving higher novelty, feasibility, relevance, and clarity via automatic and human evaluation. EvoScientist also substantially improves code execution success rates through multi-agent evolution, demonstrating persistent memory's effectiveness for end-to-end scientific discovery.

2603.08126 2026-03-10 cs.CV cs.AI cs.LG cs.SD eess.AS

Foley-Flow: Coordinated Video-to-Audio Generation with Masked Audio-Visual Alignment and Dynamic Conditional Flows

Shentong Mo, Yibing Song

2603.08125 2026-03-10 cs.CL

Ramsa: A Large Sociolinguistically Rich Emirati Arabic Speech Corpus for ASR and TTS

Rania Al-Sabbagh

2603.08124 2026-03-10 cs.RO cs.AI cs.LG

SaiVLA-0: Cerebrum--Pons--Cerebellum Tripartite Architecture for Compute-Aware Vision-Language-Action

Xiang Shi, Wenlong Huang, Menglin Zou, Xinhai Sun

Comments 14 pages, 3 figures

2603.08122 2026-03-10 cs.RO

Towards Human-Like Manipulation through RL-Augmented Teleoperation and Mixture-of-Dexterous-Experts VLA

Tutian Tang, Xingyu Ji, Wanli Xing, Ce Hao, Wenqiang Xu, Lin Shao, Cewu Lu, Qiaojun Yu, Jiangmiao Pang, Kaifeng Zhang

Comments Project Homepage: https://sites.google.com/view/mode-vla

2603.08118 2026-03-10 cs.LG

Model-based Offline RL via Robust Value-Aware Model Learning with Implicitly Differentiable Adaptive Weighting

Zhongjian Qiao, Jiafei Lyu, Boxiang Lyu, Yao Shu, Siyang Gao, Shuang Qiu

Comments Accepted at ICLR 2026

2603.08113 2026-03-10 cs.CV

SAMoE-VLA: A Scene Adaptive Mixture-of-Experts Vision-Language-Action Model for Autonomous Driving

Zihan You, Hongwei Liu, Chenxu Dang, Zhe Wang, Sining Ang, Aoqi Wang, Yan Wang

2603.08111 2026-03-10 cs.RO

DeReCo: Decoupling Representation and Coordination Learning for Object-Adaptive Decentralized Multi-Robot Cooperative Transport

Kazuki Shibata, Ryosuke Sota, Shandil Dhiresh Bosch, Yuki Kadokawa, Tsurumine Yoshihisa, Takamitsu Matsubara

Comments 9 pages, 7 figures

Abstract

Generalizing decentralized multi-robot cooperative transport across objects with diverse shapes and physical properties remains a fundamental challenge. Under decentralized execution, two key challenges arise: object-dependent representation learning under partial observability and coordination learning in multi-agent reinforcement learning (MARL) under non-stationarity. A typical approach jointly optimizes object-dependent representations and coordinated policies in an end-to-end manner while randomizing object shapes and physical properties during training. However, this joint optimization tightly couples representation and coordination learning, introducing bidirectional interference: inaccurate representations under partial observability destabilize coordination learning, while non-stationarity in MARL further degrades representation learning, resulting in sample-inefficient training. To address this structural coupling, we propose DeReCo, a novel MARL framework that decouples representation and coordination learning for object-adaptive multi-robot cooperative transport, improving sample efficiency and generalization across objects and transport scenarios. DeReCo adopts a three-stage training strategy: (1) centralized coordination learning with privileged object information, (2) reconstruction of object-dependent representations from local observations, and (3) progressive removal of privileged information for decentralized execution. This decoupling mitigates interference between representation and coordination learning and enables stable and sample-efficient training. Experimental results show that DeReCo outperforms baselines in simulation on three training objects, generalizes to six unseen objects with varying masses and friction coefficients, and achieves superior performance on two unseen objects in real-robot experiments.
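Stage (3) of the training strategy described above, progressive removal of privileged information, could be sketched as a decaying blend between the privileged object features and the representation reconstructed from local observations. The abstract does not specify how the removal is scheduled, so the linear schedule, the element-wise blend, and both function names below are assumptions for illustration only.

```python
def privileged_weight(step, total_steps):
    """Linearly decay reliance on privileged information from 1.0 to 0.0.

    Assumption: a linear schedule; the abstract does not state the schedule.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    return max(0.0, 1.0 - step / total_steps)


def blend_representation(privileged, reconstructed, weight):
    """Element-wise mix of privileged and reconstructed object features."""
    return [
        weight * p + (1.0 - weight) * r
        for p, r in zip(privileged, reconstructed)
    ]


# Usage: halfway through the schedule, both sources contribute equally.
w = privileged_weight(step=5, total_steps=10)
print(blend_representation([1.0, 2.0], [3.0, 4.0], w))
```

Once the weight reaches zero, the policy input depends only on locally reconstructed features, which matches the decentralized execution setting the abstract targets.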

2603.08100 2026-03-10 cs.CV

Adaptive MLP Pruning for Large Vision Transformers

Chengchao Shen

2603.08097 2026-03-10 cs.SD

PathBench: Speech Intelligibility Benchmark for Automatic Pathological Speech Assessment

Bence Mark Halpern, Thomas Tienkamp, Defne Abur, Tomoki Toda

Comments 5 pages, 1 table. Submitted to Interspeech 2026

2603.08095 2026-03-10 cs.CL cs.AI cs.LG

DC-W2S: Dual-Consensus Weak-to-Strong Training for Reliable Process Reward Modeling in Biological Reasoning

Chi-Min Chan, Ehsan Hajiramezanali, Xiner Li, Edward De Brouwer, Carl Edwards, Wei Xue, Sirui Han, Yike Guo, Gabriele Scalia

2603.08091 2026-03-10 cs.CL

Toward Robust LLM-Based Judges: Taxonomic Bias Evaluation and Debiasing Optimization

Hongli Zhou, Hui Huang, Rui Zhang, Kehai Chen, Bing Xu, Conghui Zhu, Tiejun Zhao, Muyun Yang

2603.08089 2026-03-10 cs.RO

Adaptive Vision-Based Control of Redundant Robots with Null-Space Interaction for Human-Robot Collaboration

Xiangjie Yan, Chen Chen, Xiang Li

2603.08088 2026-03-10 cs.LG cs.PL

EAGLE-Pangu: Accelerator-Safe Tree Speculative Decoding on Ascend NPUs

Chang Han, Yijie Hu, Jingling Liu

Comments 14 pages, 7 figures

2603.08086 2026-03-10 cs.CV

From Reactive to Map-Based AI: Tuned Local LLMs for Semantic Zone Inference in Object-Goal Navigation

Yudai Noda, Kanji Tanaka

Comments 6 pages, 5 figures, technical report

2603.08083 2026-03-10 cs.CL

High-Fidelity Pruning for Large Language Models

Yijun Zhu, Jianxin Wang, Chengchao Shen

2603.08082 2026-03-10 cs.LG

Tiny Autoregressive Recursive Models

Paulius Rauba, Claudio Fanconi, Mihaela van der Schaar

2603.08075 2026-03-10 cs.CV

TALON: Test-time Adaptive Learning for On-the-Fly Category Discovery

Yanan Wu, Yuhan Yan, Tailai Chen, Zhixiang Chi, ZiZhang Wu, Yi Jin, Yang Wang, Zhenbo Li

Comments 14 pages, 6 figures, accepted by CVPR 2026

2603.08069 2026-03-10 cs.CV

Synthetic Defect Image Generation for Power Line Insulator Inspection Using Multimodal Large Language Models

Xuesong Wang, Caisheng Wang

Comments Submitted to Engineering Applications of Artificial Intelligence, Feb. 16, 2026

2603.08068 2026-03-10 cs.AI

In-Context Reinforcement Learning for Tool Use in Large Language Models

Yaoqi Ye, Yiran Zhao, Keyu Duan, Zeyu Zheng, Kenji Kawaguchi, Cihang Xie, Michael Qizhe Shieh

2603.08062 2026-03-10 cs.LG q-bio.GN

Adversarial Domain Adaptation Enables Knowledge Transfer Across Heterogeneous RNA-Seq Datasets

Kevin Dradjat, Massinissa Hamidi, Blaise Hanczar

Comments 7 pages, 5 figures. Submitted to ECCB 2026

2603.08059 2026-03-10 cs.CV cs.AI

ImageEdit-R1: Boosting Multi-Agent Image Editing via Reinforcement Learning

Yiran Zhao, Yaoqi Ye, Xiang Liu, Michael Qizhe Shieh, Trung Bui

2603.08058 2026-03-10 cs.LG

Stabilized Fine-Tuning with LoRA in Federated Learning: Mitigating the Side Effect of Client Size and Rank via the Scaling Factor

Jiayu Huang, Xiaohu Wu, Tiantian He, Qicheng Lao

2603.08057 2026-03-10 cs.RO cs.CV

See and Switch: Vision-Based Branching for Interactive Robot-Skill Programming

Petr Vanc, Jan Kristof Behrens, Václav Hlaváč, Karla Stepanova

Comments 8 pages, 11 figures

2603.08055 2026-03-10 cs.CV cs.AI

Speed3R: Sparse Feed-forward 3D Reconstruction Models

Weining Ren, Xiao Tan, Kai Han

Comments CVPR 2026 Findings, project page: https://visual-ai.github.io/speed3r/

2603.08049 2026-03-10 cs.CL

Examining the Role of YouTube Production and Consumption Dynamics on the Formation of Extreme Ideologies

Sarmad Chandio, Rishab Nithyanand

2603.08048 2026-03-10 cs.AI

S2S-FDD: Bridging Industrial Time Series and Natural Language for Explainable Zero-shot Fault Diagnosis

Baoxue Li, Chunhui Zhao

2603.08046 2026-03-10 cs.SD eess.AS

WhispEar: A Bi-directional Framework for Scaling Whispered Speech Conversion via Pseudo-Parallel Whisper Generation

Zihao Fang, Yingda Shen, Zifan Guan, Tongtong Song, Zhenyi Liu, Zhizheng Wu

Comments Submitted to Interspeech 2026

2603.08035 2026-03-10 cs.AI cs.LG

CDRRM: Contrast-Driven Rubric Generation for Reliable and Interpretable Reward Modeling

Dengcan Liu, Fengkai Yang, Xiaohan Wang, Shurui Yan, Jiajun Chai, Jiahao Li, Yikun Ban, Zhendong Mao, Wei Lin, Guojun Yin