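arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and other fields.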

2511.17355 2026-03-09 cs.CV

UAM: A Unified Attention-Mamba Backbone of Multimodal Framework for Tumor Cell Classification

Taixi Chen, Jingyun Chen, Nancy Guo

Details

English abstract

Inspired by the recent success of the Mamba architecture in vision and language domains, we introduce a Unified Attention-Mamba (UAM) backbone. Unlike previous hybrid approaches that integrate Attention and Mamba modules in fixed proportions, our unified design flexibly combines their capabilities within a single cohesive architecture, eliminating the need for manual ratio tuning and improving encoding capability. We develop two UAM variants to comprehensively evaluate the benefits of this unified structure. Building on this backbone, we further propose a multimodal UAM framework that jointly performs cell-level classification and image segmentation. Experimental results demonstrate that UAM achieves state-of-the-art performance across both tasks on public benchmarks, surpassing leading image-based foundation models. It improves cell classification accuracy from 74% to 78% (n = 349,882 cells) and tumor segmentation precision from 75% to 80% (n = 406 patches).

2511.16050 2026-03-09 cs.RO

Bi-AQUA: Bilateral Control-Based Imitation Learning for Underwater Robot Arms via Lighting-Aware Action Chunking with Transformers

Takeru Tsunoori, Masato Kobayashi, Yuki Uranishi

2511.15481 2026-03-09 cs.CV

FunnyNodules: A Customizable Medical Dataset Tailored for Evaluating Explainable AI

Luisa Gallée, Yiheng Xiong, Meinrad Beer, Michael Götz

Comments: accepted at Medical Imaging with Deep Learning (MIDL) 2026

2511.15239 2026-03-09 cs.RO cs.MA

Symmetry-Breaking in Multi-Agent Navigation: Winding Number-Aware MPC with a Learned Topological Strategy

Tomoki Nakao, Kazumi Kasaura, Tadashi Kozuno

Comments: 12 pages, 7 figures

2511.13232 2026-03-09 cs.CV

MRIQT: Physics-Aware Diffusion Model for Image Quality Transfer in Neonatal Ultra-Low-Field MRI

Malek Al Abed, Sebiha Demir, Anne Groteklaes, Elodie Germani, Shahrooz Faghihroohi, Hemmen Sabir, Shadi Albarqouni

Comments: 5 pages, 4 figures

2511.13127 2026-03-09 cs.CV cs.CR

SPARK: Jailbreaking T2V Models by Synergistically Prompting Auditory and Recontextualized Knowledge

Zonghao Ying, Moyang Chen, Nizhang Li, Zhiqiang Wang, Wenxin Zhang, Quanchen Zou, Zonglei Jing, Aishan Liu, Xianglong Liu

2511.11368 2026-03-09 cs.CV

LaxMotion: Rethinking Supervision Granularity for 3D Human Motion Generation

Sheng Liu, Yuanzhi Liang, Sidan Du

2511.10150 2026-03-09 cs.CV

Decoupling Bias, Aligning Distributions: Synergistic Fairness Optimization for Deepfake Detection

Feng Ding, Wenhui Yi, Yunpeng Zhou, Xinan He, Hong Rao, Shu Hu

2511.07722 2026-03-09 cs.CL

Critical Confabulation: Can LLMs Hallucinate for Social Good?

Peiqi Sui, Eamon Duede, Hoyt Long, Richard Jean So

Comments: ICLR 2026 Camera Ready. 27 pages, 5 figures, 11 tables

2511.07619 2026-03-09 cs.RO

CAVER: Curious Audiovisual Exploring Robot

Luca Macesanu, Boueny Folefack, Samik Singh, Ruchira Ray, Ben Abbatematteo, Roberto Martín-Martín

Comments: 9 pages, 6 figures

2511.06202 2026-03-09 cs.RO

ExpReS-VLA: Specializing Vision-Language-Action Models Through Experience Replay and Retrieval

Shahram Najam Syed, Yatharth Ahuja, Arthur Jakobsson, Jeff Ichnowski

Comments: 8 pages, 4 figures, 3 tables, accepted to International Conference on Robotics and Automation (ICRA) 2026

2511.05826 2026-03-09 cs.LG stat.ML

CADM: Cluster-customized Adaptive Distance Metric for Categorical Data Clustering

Taixi Chen, Yiu-ming Cheung, Yiqun Zhang

Comments: Accepted by ICASSP 2026

2511.05664 2026-03-09 cs.LG

KLASS: KL-Guided Fast Inference in Masked Diffusion Models

Seo Hyun Kim, Sunwoo Hong, Hojung Jung, Youngrok Park, Se-Young Yun

Comments: NeurIPS 2025 Spotlight. Code: https://github.com/shkim0116/KLASS

2511.03738 2026-03-09 cs.CL

Activation-Space Personality Steering: Hybrid Layer Selection for Stable Trait Control in LLMs

Pranav Bhandari, Nicolas Fay, Sanjeevan Selvaganapathy, Amitava Datta, Usman Naseem, Mehwish Nasim

Comments: Accepted to EACL 2026

2511.03550 2026-03-09 cs.RO cs.HC

Indicating Robot Vision Capabilities with Augmented Reality

Hong Wang, Ridhima Phatak, James Ocampo, Zhao Han

Details

DOI: 10.1007/s12369-026-01368-0
Journal ref: International Journal of Social Robotics (2026)

English abstract

Research indicates that humans can mistakenly assume that robots and humans have the same field of view, and thus hold an inaccurate mental model of robots. This misperception may lead to failures during human-robot collaboration tasks where robots might be asked to complete impossible tasks involving out-of-view objects. The issue is more severe when robots do not have a chance to scan the scene to update their world model while focusing on assigned tasks. To help align humans' mental models of robots' vision capabilities, we propose four field-of-view indicators in augmented reality and evaluate them in a human-subjects experiment (N=41) using a collaborative assembly task, measuring accuracy, confidence, task efficiency, and workload. These indicators span a spectrum of positions. Two are egocentric, placed in the robot's eye and head space: deepening the eye sockets and adding blocks to the two sides of the eyes. Two are allocentric, anchored in the robot's task space: adding extended blocks from the sides of the eyes to the table and placing blocks directly on the table. Results showed that, when placed directly in the task space, the allocentric indicator yielded the highest accuracy, although with a delay in interpreting the robot's field of view. When placed at the robot's eyes, the egocentric indicator of deeper eye sockets, which is also feasible as a physical alteration, increased accuracy as well. Across all indicators, participants' confidence was high while cognitive load remained low. Finally, we contribute six guidelines for practitioners to apply our augmented reality indicators or physical alterations to align humans' mental models with robots' vision capabilities.

2511.00814 2026-03-09 cs.RO cs.LG cs.SY eess.SY

Real-Time Learning of Predictive Dynamic Obstacle Models for Robotic Motion Planning

Stella Kombo, Masih Haseli, Skylar X. Wei, Joel W. Burdick

Comments: 10 pages, 6 figures, submitted to IEEE International Conference on Robotics and Automation (ICRA) 2025

2510.23896 2026-03-09 cs.CL

AfriMTEB and AfriE5: Benchmarking and Adapting Text Embedding Models for African Languages

Kosei Uemura, Miaoran Zhang, David Ifeoluwa Adelani

Comments: Accepted to EACL 2026 (main conference)

2510.21536 2026-03-09 cs.RO cs.CV

AURASeg: Attention-guided Upsampling with Residual-Assistive Boundary Refinement for Onboard Robot Drivable-Area Segmentation

Narendhiran Vijayakumar, Sridevi. M

Comments: 6 pages, 4 figures, 4 tables

2510.19974 2026-03-09 cs.RO

Push Anything: Single- and Multi-Object Pushing From First Sight with Contact-Implicit MPC

Hien Bui, Yufeiyang Gao, Haoran Yang, Eric Cui, Siddhant Mody, Brian Acosta, Thomas Stephen Felix, Bibit Bianchini, Michael Posa

Comments: Presented at ICRA 2026; 8 pages, 8 figures. Hien Bui, Yufeiyang Gao, and Haoran Yang contributed equally to this work

2510.19074 2026-03-09 cs.RO

Sample-Based Hybrid Mode Control: Asymptotically Optimal Switching of Algorithmic and Non-Differentiable Control Modes

Yilang Liu, Haoxiang You, Ian Abraham

2510.18077 2026-03-09 cs.CL

Chain-of-Thought Reasoning Improves Context-Aware Translation with Large Language Models

Shabnam Ataee, Hugo Huart, Andrei Popescu-Belis

Comments: Proceedings of LREC 2026

2510.11689 2026-03-09 cs.RO cs.AI

Phys2Real: Fusing VLM Priors with Interactive Online Adaptation for Uncertainty-Aware Sim-to-Real Manipulation

Maggie Wang, Stephen Tian, Aiden Swann, Ola Shorinwa, Jiajun Wu, Mac Schwager

Comments: Accepted to IEEE International Conference on Robotics and Automation (ICRA) 2026

2510.11512 2026-03-09 cs.CV cs.AI

LikePhys: Evaluating Intuitive Physics Understanding in Video Diffusion Models via Likelihood Preference

Jianhao Yuan, Fabio Pizzati, Francesco Pinto, Lars Kunze, Ivan Laptev, Paul Newman, Philip Torr, Daniele De Martini

Comments: 23 pages, 9 figures, Project Page: https://yuanjianhao508.github.io/LikePhys/

2510.09173 2026-03-09 cs.CV

Beyond Flat Unknown Labels in Open-World Object Detection

Yuchen Zhang, Yao Lu, Johannes Betz

Comments: 8 pages, 3 figures

2510.08023 2026-03-09 cs.LG

Do We Really Need Permutations? Impact of Model Width on Linear Mode Connectivity

Akira Ito, Masanori Yamada, Daiki Chijiwa, Atsutoshi Kumagai

Comments: Accepted to the Fourteenth International Conference on Learning Representations (ICLR 2026). OpenReview: https://openreview.net/forum?id=ll8GLAic7q

2510.05278 2026-03-09 cs.LG cs.CL

Decoding Partial Differential Equations: Cross-Modal Adaptation of Decoder-only Models to PDEs

Paloma García-de-Herreros, Philipp Slusallek, Dietrich Klakow, Vagrant Gautam

Comments: ICLR 2026 Workshop on AI and Partial Differential Equations

2510.00803 2026-03-09 cs.LG cs.SI

Online Minimization of Polarization and Disagreement via Low-Rank Matrix Bandits

Federico Cinus, Yuko Kuroki, Atsushi Miyauchi, Francesco Bonchi

Comments: Accepted at ICLR 2026

2510.00502 2026-03-09 cs.LG

Diffusion Alignment as Variational Expectation-Maximization

Jaewoo Lee, Minsu Kim, Sanghyeok Choi, Inhyuck Song, Sujin Yun, Hyeongyu Kang, Woocheol Shin, Taeyoung Yun, Kiyoung Om, Jinkyoo Park

Comments: ICLR 2026

2509.23405 2026-03-09 cs.LG

Planner Aware Path Learning in Diffusion Language Models Training

Fred Zhangzhi Peng, Zachary Bezemek, Jarrid Rector-Brooks, Shuibai Zhang, Anru R. Zhang, Michael Bronstein, Alexander Tong, Avishek Joey Bose

Comments: Camera ready version for ICLR 2026

2509.23335 2026-03-09 cs.CV

DeCLIP: Decoupled Prompting for CLIP-based Multi-Label Class-Incremental Learning

Kaile Du, Zihan Ye, Junzhou Xie, Yixi Shen, Yuyang Li, Fuyuan Hu, Ling Shao, Guangcan Liu, Joost van de Weijer, Fan Lyu