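arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.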

2603.08906 2026-03-11 cs.CV physics.med-ph

Multi-Kernel Gated Decoder Adapters for Robust Multi-Task Thyroid Ultrasound under Cross-Center Shift

Maziar Sabouri, Nourhan Bayasi, Arman Rahmim

Abstract

Thyroid ultrasound (US) automation couples two competing requirements: global, geometry-driven reasoning for nodule delineation and local, texture-driven reasoning for malignancy risk assessment. Under cross-center domain shift, these cues degrade asymmetrically, yet most multi-task pipelines rely on a single shared backbone, often inducing negative transfer. In this paper, we characterize this interference across CNN (ResNet34) and medical ViT (MedSAM) backbones, and observe a consistent trend: ViTs transfer geometric priors that benefit segmentation, whereas CNNs more reliably preserve texture cues for malignancy discrimination under strong shift and artifacts. Motivated by this failure mode, we propose a lightweight family of decoder-side adapters, the Multi-Kernel Gated Adapter (MKGA) and a residual variant (ResMKGA), which refine multi-scale skip features using complementary receptive fields and apply semantic, context-conditioned gating to suppress artifact-prone content before fusion. Across two US benchmarks, the proposed adapters improve cross-center robustness: they strengthen out-of-domain segmentation and, in the CNN setting, yield clear gains in clinical TI-RADS diagnostic accuracy compared to standard multi-task baselines. Code and models will be released.

2603.08905 2026-03-11 cs.RO

Proprioceptive Safe Active Navigation and Exploration for Planetary Environments

Matthew Y. Jiang, Feifei Qian, Shipeng Liu

Comments 9 pages, 7 figures

2603.08898 2026-03-11 cs.CV

Towards Visual Query Segmentation in the Wild

Bing Fan, Minghao Li, Hanzhi Zhang, Shaohua Dong, Naga Prudhvi Mareedu, Weishi Shi, Yunhe Feng, Yan Huang, Heng Fan

2603.08897 2026-03-11 cs.CV

Comparative Analysis of Patch Attack on VLM-Based Autonomous Driving Architectures

David Fernandez, Pedram MohajerAnsari, Amir Salarpour, Long Cheng, Abolfazl Razi, Mert D. Pesé

Comments Accepted at the 2025 IEEE Intelligent Vehicles Symposium (IV 2025)

2603.08877 2026-03-11 cs.AI

Quantifying the Accuracy and Cost Impact of Design Decisions in Budget-Constrained Agentic LLM Search

Kyle McCleary, James Ghawaly

Comments Accepted in 2026 Language Resources and Evaluation Conference (LREC)

2603.08869 2026-03-11 cs.CL

One Language, Two Scripts: Probing Script-Invariance in LLM Concept Representations

Sripad Karne

Comments Accepted at the UCRL Workshop at ICLR 2026

2603.08863 2026-03-11 cs.RO

Adaptive SINDy: Residual Force System Identification Based UAV Disturbance Rejection

Fawad Mehboob, Amir Atef Habel, Roohan Ahmed Khan, Mikhail Derevianchenko, Clement Fortin, Dzmitry Tsetserukou

2603.08862 2026-03-11 cs.RO cs.LG

APPLV: Adaptive Planner Parameter Learning from Vision-Language-Action Model

Yuanjie Lu, Beichen Wang, Zhengqi Wu, Yang Li, Xiaomin Lin, Chengzhi Mao, Xuesu Xiao

2603.08860 2026-03-11 cs.RO cs.SY eess.SY

SEP-NMPC: Safety Enhanced Passivity-Based Nonlinear Model Predictive Control for a UAV Slung Payload System

Seyedreza Rezaei, Junjie Kang, Amaldev Haridevan, Jinjun Shan

Comments Accepted at ICRA 2026

2603.08859 2026-03-11 cs.LG

Expressivity-Efficiency Tradeoffs for Hybrid Sequence Models

John Cooper, Ilias Diakonikolas, Mingchen Ma, Frederic Sala

2603.08852 2026-03-11 cs.AI cs.MA cs.SE

LDP: An Identity-Aware Protocol for Multi-Agent LLM Systems

Sunil Prakash

Comments 16 pages, 9 figures, 8 tables, 4 appendices

2603.08850 2026-03-11 cs.CV

HECTOR: Hybrid Editable Compositional Object References for Video Generation

Guofeng Zhang, Angtian Wang, Jacob Zhiyuan Fang, Liming Jiang, Haotian Yang, Alan Yuille, Chongyang Ma

2603.08844 2026-03-11 cs.CV cs.AI

A Lightweight Multi-Cancer Tumor Localization Framework for Deployable Digital Pathology

Brian Isett, Rebekah Dadey, Aofei Li, Ryan C. Augustin, Kate Smith, Aatur D. Singhi, Qiangqiang Gu, Riyue Bao

Comments 9 pages, 2 figures

2603.08835 2026-03-11 cs.AI cs.CL cs.LG

MASEval: Extending Multi-Agent Evaluation from Models to Systems

Cornelius Emde, Alexander Rubinstein, Anmol Goel, Ahmed Heakl, Sangdoo Yun, Seong Joon Oh, Martin Gubri

2603.08831 2026-03-11 cs.RO cs.SY eess.SY

Predictive Control with Indirect Adaptive Laws for Payload Transportation by Quadrupedal Robots

Leila Amanzadeh, Taizoon Chunawala, Randall T. Fawcett, Alexander Leonessa, Kaveh Akbari Hamed

Comments 8 pages, 6 figures. Published in IEEE Robotics and Automation Letters

DOI: 10.1109/LRA.2024.3474550
Journal ref: IEEE Robotics and Automation Letters, vol. 9, no. 11, pp. 10359-10366, Nov. 2024

Abstract

This paper formally develops a novel hierarchical planning and control framework for robust payload transportation by quadrupedal robots, integrating a model predictive control (MPC) algorithm with a gradient-descent-based adaptive updating law. At the framework's high level, an indirect adaptive law estimates the unknown parameters of the reduced-order (template) locomotion model under varying payloads. These estimated parameters feed into an MPC algorithm for real-time trajectory planning, incorporating a convex stability criterion within the MPC constraints to ensure the stability of the template model's estimation error. The optimal reduced-order trajectories generated by the high-level adaptive MPC (AMPC) are then passed to a low-level nonlinear whole-body controller (WBC) for tracking. Extensive numerical investigations validate the framework's capabilities, showcasing the robot's proficiency in transporting unmodeled, unknown static payloads of up to 109% of its mass in experiments on flat terrain and 91% on rough terrain. The robot also successfully manages dynamic payloads of 73% of its mass on rough terrain. Performance comparisons with a normal MPC and an L1 MPC indicate a significant improvement. Furthermore, comprehensive hardware experiments conducted in indoor and outdoor environments confirm the method's efficacy on rough terrains despite uncertainties such as payload variations, push disturbances, and obstacles.
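
As an illustration of what a gradient-descent-based adaptive law can look like, the sketch below estimates a single unknown parameter from prediction errors. It is not the authors' implementation: the scalar linear model `y = phi * theta`, the function name `adapt_estimate`, the gain value, and the synthetic data are all assumptions made for this example, and the MPC and whole-body control layers are omitted.

```python
def adapt_estimate(samples, theta0=0.0, gain=0.05):
    """Gradient-descent adaptive law for one unknown parameter.

    Assumes a linear-in-parameter model y = phi * theta (an assumption
    for this sketch). Each sample is a (phi, y) pair.
    """
    theta = theta0
    for phi, y in samples:
        error = y - phi * theta      # prediction error under the current estimate
        theta += gain * phi * error  # gradient step on 0.5 * error**2
    return theta


# Hypothetical data: regressor values (phi) and noise-free measurements (y)
# generated from a true parameter of 12.0, e.g. an unknown total mass.
true_theta = 12.0
regressors = [1.0, -2.0, 0.5, 1.5, -1.0] * 200
samples = [(phi, true_theta * phi) for phi in regressors]

estimate = adapt_estimate(samples)
print(round(estimate, 6))
```

In an adaptive MPC setting of the kind the abstract describes, an estimate like this would be refreshed at each control step and passed to the planner's prediction model; the gain trades convergence speed against sensitivity to measurement noise.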

2603.08827 2026-03-11 cs.CV

Computer Vision-Based Vehicle Allotment System using Perspective Mapping

Prachi Nandi, Sonakshi Satapathy, Suchismita Chinara

2603.08825 2026-03-11 cs.LG cs.AI

Are Expressive Encoders Necessary for Discrete Graph Generation?

Jay Revolinsky, Harry Shomer, Jiliang Tang

Comments 25 pages, 15 figures, 10 tables

2603.08824 2026-03-11 cs.LG

SoftJAX & SoftTorch: Empowering Automatic Differentiation Libraries with Informative Gradients

Anselm Paulus, A. René Geist, Vít Musil, Sebastian Hoffmann, Onur Beker, Georg Martius

2603.08821 2026-03-11 cs.RO

Impact of Different Failures on a Robot's Perceived Reliability

Andrew Violette, Zhanxin Wu, Haruki Nishimura, Masha Itkina, Leticia Priebe Rocha, Mark Zolotas, Guy Hoffman, Hadas Kress-Gazit

Comments Accepted to ICRA 2026. 8 pages, 6 figures

2603.08817 2026-03-11 cs.RO

HMR-1: Hierarchical Massage Robot with Vision-Language-Model for Embodied Healthcare

Rongtao Xu, Mingming Yu, Xiaofeng Han, Yu Zhang, Kaiyi Hu, Zhe Feng, Zenghuang Fu, Changwei Wang, Weiliang Meng, Xiaopeng Zhang

2603.08814 2026-03-11 cs.RO cs.AI cs.ET cs.MA

Scale-Plan: Scalable Language-Enabled Task Planning for Heterogeneous Multi-Robot Teams

Piyush Gupta, Sangjae Bae, Jiachen Li, David Isele

2603.08812 2026-03-11 cs.CV

VisionCreator-R1: A Reflection-Enhanced Native Visual-Generation Agentic Model

Jinxiang Lai, Wenzhe Zhao, Zexin Lu, Hualei Zhang, Qinyu Yang, Rongwei Quan, Zhimin Li, Shuai Shao, Song Guo, Qinglin Lu

2603.08810 2026-03-11 cs.RO

Age-Related Differences in the Perception of Eye-Gaze from a Social Robot

Lucas Morillo-Mendez, Martien G. S. Schrooten, Oscar Martinez Mozos

Comments This is the pre-print version. Final publication available at https://doi.org/10.1007/978-3-030-90525-5_30

2603.08803 2026-03-11 cs.LG stat.ML

The Temporal Markov Transition Field

Michael Leznik

Comments 13 pages, 2 figures

2603.08800 2026-03-11 cs.CV

Granulon: Awakening Pixel-Level Visual Encoders with Adaptive Multi-Granularity Semantics for MLLM

Junyuan Mao, Qiankun Li, Linghao Meng, Zhicheng He, Xinliang Zhou, Kun Wang, Yang Liu, Yueming Jin

2603.08773 2026-03-11 cs.LG cs.AI stat.ML

Multi-level meta-reinforcement learning with skill-based curriculum

Sichen Yang, Mauro Maggioni

Comments 78 pages, 12 figures

Abstract

We consider problems in sequential decision making with natural multi-level structure, where sub-tasks are assembled together to accomplish complex goals. Systematically inferring and leveraging hierarchical structure has remained a longstanding challenge; we describe an efficient multi-level procedure for repeatedly compressing Markov decision processes (MDPs), wherein a parametric family of policies at one level is treated as single actions in the compressed MDPs at higher levels, while preserving the semantic meanings and structure of the original MDP, and mimicking the natural logic to address a complex MDP. Higher-level MDPs are themselves independent MDPs with less stochasticity, and may be solved using existing algorithms. As a byproduct, spatial or temporal scales may be coarsened at higher levels, making it more efficient to find long-term optimal policies. The multi-level representation delivered by this procedure decouples sub-tasks from each other and usually greatly reduces unnecessary stochasticity and the policy search space, leading to fewer iterations and computations when solving the MDPs. A second fundamental aspect of this work is that these multi-level decompositions plus the factorization of policies into embeddings (problem-specific) and skills (including higher-order functions) yield new transfer opportunities of skills across different problems and different levels. This whole process is framed within curriculum learning, wherein a teacher organizes the student agent's learning process in a way that gradually increases the difficulty of tasks and promotes transfer across MDPs and levels within and across curricula. The consistency of this framework and its benefits can be guaranteed under mild assumptions. We demonstrate abstraction, transferability, and curriculum learning in examples, including MazeBase+, a more complex variant of the MazeBase example.
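
The idea of treating a lower-level policy as a single action of a higher-level MDP can be illustrated with a toy example. The sketch below is only a rough analogy under strong assumptions and is not the compression procedure of the paper: the corridor environment, the two skills, and the names `run_skill` and `high_level_step` are hypothetical, and the dynamics are deterministic, whereas the abstract concerns stochastic MDPs.

```python
def run_skill(step, policy, done, state, max_steps=100):
    """Run a low-level policy until its termination condition holds.

    Returns the final state and the accumulated reward, so the whole
    rollout can be treated as ONE action of a higher-level MDP.
    """
    total = 0.0
    for _ in range(max_steps):
        if done(state):
            break
        state, reward = step(state, policy(state))
        total += reward
    return state, total


# Hypothetical low-level MDP: a corridor of cells 0..9, reward -1 per move.
def step(state, action):
    return max(0, min(9, state + action)), -1.0


# Two hypothetical skills, each a (policy, termination test) pair.
skills = {
    "to_door": (lambda s: 1 if s < 4 else -1, lambda s: s == 4),
    "to_goal": (lambda s: 1, lambda s: s == 9),
}


def high_level_step(state, skill_name):
    """One transition of the compressed MDP: execute an entire skill."""
    policy, done = skills[skill_name]
    return run_skill(step, policy, done, state)


door_state, r1 = high_level_step(0, "to_door")
goal_state, r2 = high_level_step(door_state, "to_goal")
print(door_state, r1, goal_state, r2)
```

At the higher level the agent chooses between two actions instead of planning nine primitive moves, which is the kind of reduction in search space the abstract refers to.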

2603.08763 2026-03-11 cs.LG cs.RO

SPREAD: Subspace Representation Distillation for Lifelong Imitation Learning

Kaushik Roy, Giovanni D'urso, Nicholas Lawrance, Brendan Tidd, Peyman Moghadam

Comments IEEE International Conference on Robotics & Automation (ICRA) 2026

2603.08759 2026-03-11 cs.SD cs.AI

EDMFormer: Genre-Specific Self-Supervised Learning for Music Structure Segmentation

Sahal Sajeer, Krish Patel, Oscar Chung, Joel Song Bae

Comments Published in CUCAI 2026 conference proceedings

2603.08758 2026-03-11 cs.LG cs.AI

Generalized Reduction to the Isotropy for Flexible Equivariant Neural Fields

Alejandro García-Castellanos, Gijs Bellaard, Remco Duits, Daniel Pelt, Erik J Bekkers

2603.08754 2026-03-11 cs.LG cs.AI

Hindsight Credit Assignment for Long-Horizon LLM Agents

Hui-Ze Tan, Xiao-Wen Yang, Hao Chen, Jie-Jing Shao, Yi Wen, Yuteng Shen, Weihong Luo, Xiku Du, Lan-Zhe Guo, Yu-Feng Li