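arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.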

2603.09576 2026-03-11 cs.LG cs.AI

Routing without Forgetting

Alessio Masano, Giovanni Bellitto, Dipam Goswami, Joost Van de Weijer, Concetto Spampinato

English abstract

Continual learning in transformers is commonly addressed through parameter-efficient adaptation: prompts, adapters, or LoRA modules are specialized per task while the backbone remains frozen. Although effective in controlled multi-epoch settings, these approaches rely on gradual gradient-based specialization and struggle in Online Continual Learning (OCL), where data arrive as a non-stationary stream and each sample may be observed only once. We recast continual learning in transformers as a routing problem: under strict online constraints, the model must dynamically select the appropriate representational subspace for each input without explicit task identifiers or repeated optimization. We thus introduce Routing without Forgetting (RwF), a transformer architecture augmented with energy-based associative retrieval layers inspired by Modern Hopfield Networks. Instead of storing or merging task-specific prompts, RwF generates dynamic prompts through single-step associative retrieval over the transformer token embeddings at each layer. Retrieval corresponds to the closed-form minimization of a strictly convex free-energy functional, enabling input-conditioned routing within each forward pass, independently of iterative gradient refinement. Across challenging class-incremental benchmarks, RwF improves over existing prompt-based methods. On Split-ImageNet-R and Split-ImageNet-S, RwF outperforms prior prompt-based approaches by a large margin, even in few-shot learning regimes. These results indicate that embedding energy-based associative routing directly within the transformer backbone provides a principled and effective foundation for OCL.
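
The abstract's "single-step associative retrieval" and "closed-form minimization of a strictly convex free-energy functional" can be illustrated with the standard Modern Hopfield update, in which the retrieval weights are a softmax over similarity scores. The sketch below is not the authors' RwF implementation: the function names, the inverse temperature `beta`, and the random `memory` matrix are hypothetical choices made only for illustration. It assumes the free energy takes the usual form F(p) = -<p, s> + (1/beta) * sum_i p_i log p_i over the probability simplex, whose unique minimizer is softmax(beta * s).

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D array.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def free_energy(p, scores, beta):
    # Assumed form: F(p) = -<p, scores> + (1/beta) * sum_i p_i log p_i.
    # Strictly convex in p on the probability simplex for beta > 0.
    mask = p > 0  # by convention 0 * log 0 = 0
    neg_entropy = np.sum(p[mask] * np.log(p[mask]))
    return -np.dot(p, scores) + neg_entropy / beta

def hopfield_retrieve(memory, query, beta=1.0):
    # One retrieval step: similarity scores, softmax weights,
    # then a convex combination of the stored patterns.
    scores = memory @ query
    weights = softmax(beta * scores)
    return weights @ memory, weights, scores

# Hypothetical toy data: 8 stored patterns of dimension 16.
rng = np.random.default_rng(0)
memory = rng.normal(size=(8, 16))
query = memory[3] + 0.1 * rng.normal(size=16)  # noisy copy of pattern 3
beta = 4.0

retrieved, weights, scores = hopfield_retrieve(memory, query, beta)

# The softmax weights attain the closed-form minimum -(1/beta) * logsumexp(beta * scores).
shifted = beta * scores - np.max(beta * scores)
logsumexp = np.max(beta * scores) + np.log(np.sum(np.exp(shifted)))
print("free energy at softmax weights:", free_energy(weights, scores, beta))
print("closed-form minimum:           ", -logsumexp / beta)
```

Because the minimizer is available in closed form, retrieval needs no iterative optimization: a single forward computation yields the routing weights. Larger values of `beta` concentrate the weights on the best-matching stored pattern, while smaller values blend several patterns.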

2603.09574 2026-03-11 cs.RO cs.LG

SCDP: Learning Humanoid Locomotion from Partial Observations via Mixed-Observation Distillation

Milo Carroll, Tianhu Peng, Lingfan Bao, Chengxu Zhou, Zhibin Li

Comments 6 pages, 8 figures, 5 tables, IROS

2603.09571 2026-03-11 cs.LG math.OC

An Optimal Control Approach To Transformer Training

Kağan Akman, Naci Saldı, Serdar Yüksel

2603.09566 2026-03-11 cs.CV

GeoAlignCLIP: Enhancing Fine-Grained Vision-Language Alignment in Remote Sensing via Multi-Granular Consistency Learning

Xiao Yang, Ronghao Fu, Zhuoran Duan, Zhiwen Lin, Xueyan Liu, Bo Yang

2603.09563 2026-03-11 cs.LG

Learning Bayesian and Markov Networks with an Unreliable Oracle

Juha Harviainen, Pekka Parviainen, Vidya Sagar Sharma

2603.09557 2026-03-11 cs.RO

Trajectory Optimization for Self-Wrap-Aware Cable-Towed Planar Object Manipulation under Implicit Tension Constraints

Yu Li, Amin Fakhari, Hamid Sadeghian

2603.09556 2026-03-11 cs.CL

ALARM: Audio-Language Alignment for Reasoning Models

Petr Grinberg, Hassan Shahmohammadi

Comments Submitted to Interspeech 2026

2603.09552 2026-03-11 cs.RO

On the Cost of Evolving Task Specialization in Multi-Robot Systems

Paolo Leopardi, Heiko Hamann, Jonas Kuckling, Tanja Katharina Kaiser

Comments Accepted for publication in the proceedings of ANTS 2026 - 15th International Conference on Swarm Intelligence

2603.09548 2026-03-11 cs.CV cs.GR

A comprehensive study of time-of-flight non-line-of-sight imaging

Julio Marco, Adrian Jarabo, Ji Hyun Nam, Alberto Tosi, Diego Gutierrez, Andreas Velten

2603.09542 2026-03-11 cs.RO

NS-VLA: Towards Neuro-Symbolic Vision-Language-Action Models

Ziyue Zhu, Shangyang Wu, Shuai Zhao, Zhiqiu Zhao, Shengjie Li, Yi Wang, Fang Li, Haoran Luo

2603.09541 2026-03-11 cs.CV cs.MM

Memory-Guided View Refinement for Dynamic Human-in-the-loop EQA

Xin Lu, Rui Li, Xun Huang, Weixin Li, Chuanqing Zhuang, Jiayuan Li, Zhengda Lu, Jun Xiao, Yunhong Wang

2603.09538 2026-03-11 cs.CV

Towards Unified Multimodal Interleaved Generation via Group Relative Policy Optimization

Ming Nie, Chunwei Wang, Jianhua Han, Hang Xu, Li Zhang

2603.09533 2026-03-11 cs.AI cs.CL

Enhancing Debunking Effectiveness through LLM-based Personality Adaptation

Pietro Dell'Oglio, Alessandro Bondielli, Francesco Marcelloni, Lucia C. Passaro

Comments In: Computational Intelligence. IJCCI 2025. Springer, Cham (2026)

2603.09530 2026-03-11 cs.CV

DCAU-Net: Differential Cross Attention and Channel-Spatial Feature Fusion for Medical Image Segmentation

Yanxin Li, Hui Wan, Libin Lan

Comments Submitted to IJCNN 2026, 6 pages, 5 tables, 4 figures

2603.09527 2026-03-11 cs.LG cs.AI

Efficiently Aligning Draft Models via Parameter- and Data-Efficient Adaptation

Luxi Lin, Zhihang Lin, Zhanpeng Zeng, Yuhao Chen, Qingyu Zhang, Jixiang Luo, Xuelong Li, Rongrong Ji

Comments 10 pages

2603.09517 2026-03-11 cs.CL cs.LG

You Didn't Have to Say It like That: Subliminal Learning from Faithful Paraphrases

Isaia Gisler, Zhonghao He, Tianyi Qiu

Comments Accepted for Spotlight presentation at EACL 2026 SRW. 5 pages, 2 figures, plus appendix. Equal supervision by Zhonghao He and Tianyi Qiu

2603.09512 2026-03-11 cs.CV

Probing the Reliability of Driving VLMs: From Inconsistent Responses to Grounded Temporal Reasoning

Chun-Peng Chang, Chen-Yu Wang, Holger Caesar, Alain Pagani

2603.09503 2026-03-11 cs.CL

Modelling the Diachronic Emergence of Phoneme Frequency Distributions

Fermín Moscoso del Prado Martín, Suchir Salhan

2603.09496 2026-03-11 cs.CV

SurgFed: Language-guided Multi-Task Federated Learning for Surgical Video Understanding

Zheng Fang, Ziwei Niu, Ziyue Wang, Zhu Zhuo, Haofeng Liu, Shuyang Qian, Jun Xia, Yueming Jin

2603.09490 2026-03-11 cs.LG cs.AI

Temporal-Conditioned Normalizing Flows for Multivariate Time Series Anomaly Detection

David Baumgartner, Helge Langseth, Kenth Engø-Monsen, Heri Ramampiaro

2603.09486 2026-03-11 cs.AI

Vibe-Creation: The Epistemology of Human-AI Emergent Cognition

Ilya Levin

Comments 11 pages, 1 figure

2603.09484 2026-03-11 cs.CV

Component-Aware Sketch-to-Image Generation Using Self-Attention Encoding and Coordinate-Preserving Fusion

Ali Zia, Muhammad Umer Ramzan, Usman Ali, Muhammad Faheem, Abdelwahed Khamis, Shahnawaz Qureshi

2603.09482 2026-03-11 cs.RO

StyleVLA: Driving Style-Aware Vision Language Action Model for Autonomous Driving

Yuan Gao, Dengyuan Hua, Mattia Piccinini, Finn Rasmus Schäfer, Korbinian Moller, Lin Li, Johannes Betz

Comments 8 pages

2603.09481 2026-03-11 cs.AI

GenePlan: Evolving Better Generalized PDDL Plans using Large Language Models

Andrew Murray, Danial Dervovic, Alberto Pozanco, Michael Cashmore

Comments 54 pages, 4 figures. Accepted to ICAPS 2026

2603.09476 2026-03-11 cs.AI

Telogenesis: Goal Is All U Need

Zhuoran Deng, Yizhi Zhang, Ziyi Zhang, Wan Shen

Comments 6 pages, 3 figures, submitted to ALIFE 2026

2603.09471 2026-03-11 cs.CV

OmniEarth: A Benchmark for Evaluating Vision-Language Models in Geospatial Tasks

Ronghao Fu, Haoran Liu, Weijie Zhang, Zhiwen Lin, Xiao Yang, Peng Zhang, Bo Yang

2603.09470 2026-03-11 cs.CV

The Patrologia Graeca Corpus: OCR, Annotation, and Open Release of Noisy Nineteenth-Century Polytonic Greek Editions

Chahan Vidal-Gorène, Bastien Kindt

2603.09466 2026-03-11 cs.CV

TopoOR: A Unified Topological Scene Representation for the Operating Room

Tony Danjun Wang, Ka Young Kim, Tolga Birdal, Nassir Navab, Lennart Bastian

2603.09463 2026-03-11 cs.AI

An Empirical Study and Theoretical Explanation on Task-Level Model-Merging Collapse

Yuan Cao, Dezhi Ran, Yuzhe Guo, Mengzhou Wu, Simin Chen, Linyi Li, Wei Yang, Tao Xie

2603.09460 2026-03-11 cs.RO

SEA-Nav: Efficient Policy Learning for Safe and Agile Quadruped Navigation in Cluttered Environments

Shiyi Chen, Mingye Yang, Haiyan Mao, Jiaqi Zhang, Haiyi Liu, Shuheng He, Debing Zhang, Zihao Qiu, Chun Zhang

Comments Project website: https://11chens.github.io/sea-nav/