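arXivDaily daily academic digest: synchronized with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and other fields.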

2602.23899 2026-03-02 cs.CV cs.AI cs.LG

Experience-Guided Self-Adaptive Cascaded Agents for Breast Cancer Screening and Diagnosis with Reduced Biopsy Referrals

Pramit Saha, Mohammad Alsharid, Joshua Strong, J. Alison Noble

Abstract

We propose an experience-guided cascaded multi-agent framework for Breast Ultrasound Screening and Diagnosis, called BUSD-Agent, that aims to reduce diagnostic escalation and unnecessary biopsy referrals. Our framework models screening and diagnosis as a two-stage, selective decision-making process. A lightweight 'screening clinic' agent, restricted to classification models as tools, selectively filters out benign and normal cases from further diagnostic escalation when malignancy risk and uncertainty are estimated as low. Cases that have higher risks are escalated to the 'diagnostic clinic' agent, which integrates richer perception and radiological description tools to make a secondary decision on biopsy referral. To improve agent performance, past records of pathology-confirmed outcomes along with image embeddings, model predictions, and historical agent actions are stored in a memory bank as structured decision trajectories. For each new case, BUSD-Agent retrieves similar past cases based on image, model response and confidence similarity to condition the agent's current decision policy. This enables retrieval-conditioned in-context adaptation that dynamically adjusts model trust and escalation thresholds from prior experiences without parameter updates. Evaluation across 10 breast ultrasound datasets shows that the proposed experience-guided workflow reduces diagnostic escalation in BUSD-Agent from 84.95% to 58.72% and overall biopsy referrals from 59.50% to 37.08%, compared to the same architecture without trajectory conditioning, while improving average screening specificity by 68.48% and diagnostic specificity by 6.33%.
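
The retrieval-conditioned screening step described in this abstract can be illustrated with a toy sketch. This is not the authors' implementation: the paper conditions an agent in context, whereas the sketch below swaps in a simple numeric threshold rule. The names `Trajectory`, `retrieve`, `adapted_threshold`, and `screen`, and the values `base=0.2` and `step=0.1`, are all hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Trajectory:
    """One stored past case (hypothetical structure)."""
    embedding: Sequence[float]  # image embedding of the past case
    risk: float                 # malignancy risk the model predicted then
    malignant: bool             # pathology-confirmed outcome


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(memory: List[Trajectory], embedding: Sequence[float],
             k: int = 3) -> List[Trajectory]:
    """Return the k stored cases most similar to the new image embedding."""
    ranked = sorted(memory, key=lambda t: cosine(t.embedding, embedding),
                    reverse=True)
    return ranked[:k]


def adapted_threshold(neighbors: List[Trajectory], base: float = 0.2,
                      step: float = 0.1) -> float:
    """Assumed rule: raise the escalation threshold when similar past cases
    were benign, lower it when they were malignant."""
    if not neighbors:
        return base
    malignant_frac = sum(t.malignant for t in neighbors) / len(neighbors)
    return base + step * (1.0 - 2.0 * malignant_frac)


def screen(memory: List[Trajectory], embedding: Sequence[float],
           risk: float, k: int = 3) -> str:
    """Stage-one decision: escalate to the diagnostic stage or discharge."""
    threshold = adapted_threshold(retrieve(memory, embedding, k))
    return "escalate" if risk >= threshold else "discharge"


memory = [
    Trajectory([1.0, 0.0], 0.25, False),
    Trajectory([0.9, 0.1], 0.22, False),
    Trajectory([0.0, 1.0], 0.30, True),
]
decision_with_memory = screen(memory, [1.0, 0.05], risk=0.25, k=2)
decision_without_memory = screen([], [1.0, 0.05], risk=0.25, k=2)
```

In this toy setup the same case with risk 0.25 is escalated when the memory bank is empty but discharged when its two nearest stored neighbors were confirmed benign, which mirrors the reduced escalation the abstract reports, though only qualitatively.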

2602.23898 2026-03-02 cs.CV cs.AI cs.CL

Ref-Adv: Exploring MLLM Visual Reasoning in Referring Expression Tasks

Qihua Dong, Kuo Yang, Lin Ju, Handong Zhao, Yitian Zhang, Yizhou Wang, Huimin Zeng, Jianglin Lu, Yun Fu

Comments ICLR 2026

2602.23896 2026-03-02 cs.RO

TSC: Topology-Conditioned Stackelberg Coordination for Multi-Agent Reinforcement Learning in Interactive Driving

Xiaotong Zhang, Gang Xiong, Yuanjing Wang, Siyu Teng, Alois Knoll, Long Chen

Comments 12 pages, 8 figures

2602.23894 2026-03-02 cs.CV

SelfOccFlow: Towards end-to-end self-supervised 3D Occupancy Flow prediction

Xavier Timoneda, Markus Herb, Fabian Duerr, Daniel Goehring

Comments Accepted version. Final version is published in IEEE Robotics and Automation Letters, DOI: 10.1109/LRA.2026.3665447

2602.23890 2026-03-02 cs.CV

DACESR: Degradation-Aware Conditional Embedding for Real-World Image Super-Resolution

Xiaoyan Lei, Wenlong Zhang, Biao Luo, Hui Liang, Weifeng Cao, Qiuting Lin

Comments Accepted by TIP

2602.23880 2026-03-02 cs.LG

A Theory of Random Graph Shift in Truncated-Spectrum vRKHS

Zhang Wan, Tingting Mu, Samuel Kaski

2602.23876 2026-03-02 cs.AI cs.LG

RF-Agent: Automated Reward Function Design via Language Agent Tree Search

Ning Gao, Xiuhui Zhang, Xingyu Jiang, Mukang You, Mohan Zhang, Yue Deng

Comments 39 pages, 9 tables, 11 figures, Project page see https://github.com/deng-ai-lab/RF-Agent

2602.23871 2026-03-02 cs.CV cs.LG

Bandwidth-adaptive Cloud-Assisted 360-Degree 3D Perception for Autonomous Vehicles

Faisal Hawlader, Rui Meireles, Gamal Elghazaly, Ana Aguiar, Raphaël Frank

2602.23870 2026-03-02 cs.RO

Hybrid Offline-Online Reinforcement Learning for Sensorless, High-Precision Force Regulation in Surgical Robotic Grasping

Edoardo Fazzari, Omar Mohamed, Khalfan Hableel, Hamdan Alhadhrami, Cesare Stefanini

Abstract

Precise grasp force regulation in tendon-driven surgical instruments is fundamentally limited by nonlinear coupling between motor dynamics, transmission compliance, friction, and distal mechanics. Existing solutions typically rely on distal force sensing or analytical compensation, increasing hardware complexity or degrading performance under dynamic motion. We present a sensorless control framework that combines physics-consistent modeling and hybrid reinforcement learning to achieve high-precision distal force regulation in a proximally actuated surgical end-effector. We develop a first-principles digital twin of the da Vinci Xi grasping mechanism that captures coupled electrical, transmission, and jaw dynamics within a unified differential-algebraic formulation. To safely learn control policies in this stiff and highly nonlinear system, we introduce a three-stage pipeline: (i) a receding-horizon CMA-ES oracle that generates dynamically feasible expert trajectories, (ii) fully offline policy learning via Implicit Q-Learning to ensure stable initialization without unsafe exploration, and (iii) online refinement using TD3 for adaptation to on-policy dynamics. The resulting policy directly maps proximal measurements to motor voltages and requires no distal sensing. In simulation, the controller maintains grasp force within 1% of the desired reference during multi-harmonic jaw motion. Hardware experiments demonstrate average force errors below 4% across diverse trajectories, validating sim-to-real transfer. The learned policy contains approximately 71k parameters and executes at kHz rates, enabling real-time deployment. These results demonstrate that high-fidelity modeling combined with structured offline-online RL can recover precise distal force behavior without additional sensing, offering a scalable and mechanically compatible solution for surgical robotic manipulation.

2602.23869 2026-03-02 cs.CV

Open-Vocabulary Semantic Segmentation in Remote Sensing via Hierarchical Attention Masking and Model Composition

Mohammadreza Heidarianbaei, Mareike Dorozynski, Hubert Kanyamahanga, Max Mehltretter, Franz Rottensteiner

Comments Published in the proceedings of the British Machine Vision Conference Workshops 2025

2602.23864 2026-03-02 cs.AI

RUMAD: Reinforcement-Unifying Multi-Agent Debate

Chao Wang, Han Lin, Huaze Tang, Huijing Lin, Wenbo Ding

Comments 13 pages, 3 figures

2602.23863 2026-03-02 cs.CV cs.CL

NAU-QMUL: Utilizing BERT and CLIP for Multi-modal AI-Generated Image Detection

Xiaoyu Guo, Arkaitz Zubiaga

2602.23852 2026-03-02 cs.LG eess.SP

ULW-SleepNet: An Ultra-Lightweight Network for Multimodal Sleep Stage Scoring

Zhaowen Wang, Dongdong Zhou, Qi Xu, Fengyu Cong, Mohammad Al-Sa'd, Jenni Raitoharju

Comments Accepted to ICASSP 2026

2602.23843 2026-03-02 cs.RO

OmniXtreme: Breaking the Generality Barrier in High-Dynamic Humanoid Control

Yunshen Wang, Shaohang Zhu, Peiyuan Zhi, Yuhan Li, Jiaxin Li, Yong-Lu Li, Yuchen Xiao, Xingxing Wang, Baoxiong Jia, Siyuan Huang

2602.23832 2026-03-02 cs.RO

OmniTrack: General Motion Tracking via Physics-Consistent Reference

Yuhan Li, Peiyuan Zhi, Yunshen Wang, Tengyu Liu, Sixu Yan, Wenyu Liu, Xinggang Wang, Baoxiong Jia, Siyuan Huang

Comments website: https://omnitrack-humanoid.github.io/

2602.23826 2026-03-02 cs.CL cs.LG

GLUScope: A Tool for Analyzing GLU Neurons in Transformer Language Models

Sebastian Gerstner, Hinrich Schütze

Comments 6 pages for main body, 9 pages in total. 4 figures

2602.23824 2026-03-02 cs.LG

Inferring Chronic Treatment Onset from ePrescription Data: A Renewal Process Approach

Pavlin G. Poličar, Dalibor Stanimirović, Blaž Zupan

2602.23821 2026-03-02 cs.RO

Acceleration-Based Control of Fixed-Wing UAVs for Guidance Applications

Jixiang Wang, Siyuan Yang, Ziyi Wu, Siqi Wei, Ashay Wakode, Agata Barcis, Hung Nguyen, Shaoming He

2602.23820 2026-03-02 cs.CV

Denoising-Enhanced YOLO for Robust SAR Ship Detection

Xiaojing Zhao, Shiyang Li, Zena Chu, Ying Zhang, Peinan Hao, Tianzi Yan, Jiajia Chen, Huicong Ning

2602.23817 2026-03-02 cs.CV

Footprint-Guided Exemplar-Free Continual Histopathology Report Generation

Pratibha Kumari, Daniel Reisenbüchler, Afshin Bozorgpour, Yousef Sadegheih, Priyankar Choudhary, Dorit Merhof

2602.23816 2026-03-02 cs.LG cs.AI

Learning to maintain safety through expert demonstrations in settings with unknown constraints: A Q-learning perspective

George Papadopoulos, George A. Vouros

Comments Accepted for publication at AAMAS 2026

2602.23814 2026-03-02 cs.CV

Action-Geometry Prediction with 3D Geometric Prior for Bimanual Manipulation

Chongyang Xu, Haipeng Li, Shen Cheng, Jingyu Hu, Haoqiang Fan, Ziliang Feng, Shuaicheng Liu

Comments Accepted by CVPR 2026

2602.23806 2026-03-02 cs.CV cs.AI

See, Act, Adapt: Active Perception for Unsupervised Cross-Domain Visual Adaptation via Personalized VLM-Guided Agent

Tianci Tang, Tielong Cai, Hongwei Wang, Gaoang Wang

2602.23804 2026-03-02 cs.LG

Actor-Critic Pretraining for Proximal Policy Optimization

Andreas Kernbach, Amr Elsheikh, Nicolas Grupp, René Nagel, Marco F. Huber

2602.23802 2026-03-02 cs.AI cs.CV

EMO-R3: Reflective Reinforcement Learning for Emotional Reasoning in Multimodal Large Language Models

Yiyang Fang, Wenke Huang, Pei Fu, Yihao Yang, Kehua Su, Zhenbo Luo, Jian Luan, Mang Ye

Comments Accepted by CVPR 2026

2602.23792 2026-03-02 cs.CL

Divide and Conquer: Accelerating Diffusion-Based Large Language Models via Adaptive Parallel Decoding

Xiangzhong Luo, Yilin An, Zhicheng Yu, Weichen Liu, Xu Yang

Comments 11 pages, 7 figures

2602.23789 2026-03-02 cs.LG cs.AI

UPath: Universal Planner Across Topological Heterogeneity For Grid-Based Pathfinding

Aleksandr Ananikian, Daniil Drozdov, Konstantin Yakovlev

2602.23785 2026-03-02 cs.LG

Provable Subspace Identification of Nonlinear Multi-view CCA

Zhiwei Han, Stefan Matthes, Hao Shen

2602.23784 2026-03-02 cs.LG cs.AI q-fin.CP q-fin.TR

TradeFM: A Generative Foundation Model for Trade-flow and Market Microstructure

Maxime Kawawa-Beaudan, Srijan Sood, Kassiani Papasotiriou, Daniel Borrajo, Manuela Veloso

Comments 29 pages, 17 figures, 6 tables. Preprint

2602.23777 2026-03-02 cs.AI

Reasoning-Driven Multimodal LLM for Domain Generalization

Zhipeng Xu, Zilong Wang, Xinyang Jiang, Dongsheng Li, De Cheng, Nannan Wang

Comments Accepted at ICLR 2026 (Poster)