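arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.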

2602.21064 2026-02-25 cs.AI cs.CV cs.LG

Motivation is Something You Need

Mehdi Acheli, Walid Gaaloul

Abstract

This work introduces a novel training paradigm that draws from affective neuroscience. Inspired by the interplay of emotions and cognition in the human brain and more specifically the SEEKING motivational state, we design a dual-model framework where a smaller base model is trained continuously, while a larger motivated model is activated intermittently during predefined "motivation conditions". The framework mimics the emotional state of high curiosity and anticipation of reward in which broader brain regions are recruited to enhance cognitive performance. Exploiting scalable architectures where larger models extend smaller ones, our method enables shared weight updates and selective expansion of network capacity during noteworthy training steps. Empirical evaluation on the image classification task demonstrates that not only does the alternating training scheme efficiently and effectively enhance the base model compared to a traditional scheme, but in some cases the motivated model also surpasses its standalone counterpart despite seeing less data per epoch. This opens the possibility of simultaneously training two models tailored to different deployment constraints with competitive or superior performance while keeping training cost lower than when training the larger model.


2602.21061 2026-02-25 cs.AI

Tool Building as a Path to "Superintelligence"

David Koplow, Tomer Galanti, Tomaso Poggio

2602.21054 2026-02-25 cs.CV cs.AI cs.CL

VAUQ: Vision-Aware Uncertainty Quantification for LVLM Self-Evaluation

Seongheon Park, Changdae Oh, Hyeong Kyu Choi, Xuefeng Du, Sharon Li

2602.21053 2026-02-25 cs.CV

OCR-Agent: Agentic OCR with Capability and Memory Reflection

Shimin Wen, Zeyu Zhang, Xingdou Bian, Hongjie Zhu, Lulu He, Layi Shama, Daji Ergu, Ying Cai

2602.21046 2026-02-25 cs.LG

PIME: Prototype-based Interpretable MCTS-Enhanced Brain Network Analysis for Disorder Diagnosis

Kunyu Zhang, Yanwu Yang, Jing Zhang, Xiangjie Shi, Shujian Yu

2602.21044 2026-02-25 cs.AI

LogicGraph: Benchmarking Multi-Path Logical Reasoning via Neuro-Symbolic Generation and Verification

Yanrui Wu, Lingling Zhang, Xinyu Zhang, Jiayu Chang, Pengyu Li, Xu Jiang, Jingtao Hu, Jun Liu

Comments 24 pages, 17 figures

2602.21042 2026-02-25 cs.CV

OmniOCR: Generalist OCR for Ethnic Minority Languages

Bonan Liu, Zeyu Zhang, Bingbing Meng, Han Wang, Hanshuo Zhang, Chengping Wang, Daji Ergu, Ying Cai

2602.21035 2026-02-25 cs.CV cs.MM

Not Just What's There: Enabling CLIP to Comprehend Negated Visual Descriptions Without Fine-tuning

Junhao Xiao, Zhiyu Wu, Hao Lin, Yi Chen, Yahui Liu, Xiaoran Zhao, Zixu Wang, Zejiang He

2602.21033 2026-02-25 cs.CV cs.AI cs.LG cs.SE

MIP Candy: A Modular PyTorch Framework for Medical Image Processing

Tianhao Fu, Yucheng Chen

2602.21028 2026-02-25 cs.RO

Surface-based Manipulation Using Tunable Compliant Porous-Elastic Soft Sensing

Gayatri Indukumar, Muhammad Awais, Diana Cafiso, Matteo Lo Preti, Lucia Beccai

Comments 6 pages, 6 figures, 1 table, to be published in RoboSoft 2026 proceedings

2602.21020 2026-02-25 cs.LG cs.GT cs.MA

Matching Multiple Experts: On the Exploitability of Multi-Agent Imitation Learning

Antoine Bergerault, Volkan Cevher, Negar Mehr

2602.21015 2026-02-25 cs.CV

From Perception to Action: An Interactive Benchmark for Vision Reasoning

Yuhao Wu, Maojia Song, Yihuai Lan, Lei Wang, Zhiqiang Hu, Yao Xiao, Heng Zhou, Weihua Zheng, Dylan Raharja, Soujanya Poria, Roy Ka-Wei Lee

Comments Work in progress. Website: https://social-ai-studio.github.io/CHAIN/

2602.21010 2026-02-25 cs.CV

Le-DETR: Revisiting Real-Time Detection Transformer with Efficient Encoder Design

Jiannan Huang, Aditya Kane, Fengzhe Zhou, Yunchao Wei, Humphrey Shi

Comments CVPR Findings

Abstract

Real-time object detection is crucial for real-world applications as it requires high accuracy with low latency. While Detection Transformers (DETR) have demonstrated significant performance improvements, current real-time DETR models are challenging to reproduce from scratch due to excessive pre-training overheads on the backbone, constraining research advancements by hindering the exploration of novel backbone architectures. In this paper, we want to show that by using general good design, it is possible to have high performance with low pre-training cost. After a thorough study of the backbone architecture, we propose EfficientNAT at various scales, which incorporates modern efficient convolution and local attention mechanisms. Moreover, we re-design the hybrid encoder with local attention, significantly enhancing both performance and inference speed. Based on these advancements, we present Le-DETR (Low-cost and Efficient DEtection TRansformer), which achieves a new SOTA in real-time detection using only ImageNet1K and COCO2017 training datasets, saving about 80% of the images in the pre-training stage compared with previous methods. We demonstrate that, when well designed, real-time DETR models can achieve strong performance without the need for complex and computationally expensive pretraining. Extensive experiments show that Le-DETR-M/L/X achieves 52.9/54.3/55.1 mAP on COCO Val2017 with 4.45/5.01/6.68 ms on an RTX4090. It surpasses YOLOv12-L/X by +0.6/-0.1 mAP while achieving similar speed and +20% speedup. Compared with DEIM-D-FINE, Le-DETR-M achieves +0.2 mAP with slightly faster inference, and surpasses DEIM-D-FINE-L by +0.4 mAP with only 0.4 ms additional latency. Code and weights will be open-sourced.


2602.20976 2026-02-25 cs.CL cs.CY

Evaluating Proactive Risk Awareness of Large Language Models

Xuan Luo, Yubin Chen, Zhiyu Hou, Linpu Yu, Geng Tu, Jing Li, Ruifeng Xu

2602.20973 2026-02-25 cs.CL

Linear Reasoning vs. Proof by Cases: Obstacles for Large Language Models in FOL Problem Solving

Yuliang Ji, Fuchen Shen, Jian Wu, Qiujie Xie, Yue Zhang

2602.20972 2026-02-25 cs.CV

Are Multimodal Large Language Models Good Annotators for Image Tagging?

Ming-Kun Xie, Jia-Hao Xiao, Zhiqiang Kou, Zhongnian Li, Gang Niu, Masashi Sugiyama

2602.20966 2026-02-25 cs.CL

Blackbird Language Matrices: A Framework to Investigate the Linguistic Competence of Language Models

Paola Merlo, Chunyang Jiang, Giuseppe Samo, Vivi Nastase

Comments Under review, 46 pages, 5 tables, 28 figures

2602.20947 2026-02-25 cs.LG cs.CV

Estimation of Confidence Bounds in Binary Classification using Wilson Score Kernel Density Estimation

Thorbjørn Mosekjær Iversen, Zebin Duan, Frederik Hagelskjær

2602.20943 2026-02-25 cs.CV

UFO: Unifying Feed-Forward and Optimization-based Methods for Large Driving Scene Modeling

Kaiyuan Tan, Yingying Shen, Mingfei Tu, Haohui Zhu, Bing Wang, Guang Chen, Hangjun Ye, Haiyang Sun

2602.20937 2026-02-25 cs.LG

Extending $μ$P: Spectral Conditions for Feature Learning Across Optimizers

Akshita Gupta, Marieme Ngom, Sam Foreman, Venkatram Vishwanath

Comments 10 main pages, 16 appendix pages and 17 figures; Amended version of the publication in 17th International OPT Workshop on Optimization for Machine Learning

2602.20934 2026-02-25 cs.AI

Architecting AgentOS: From Token-Level Context to Emergent System-Level Intelligence

ChengYou Li, XiaoDong Liu, XiangBao Meng, XinYu Zhao

Comments 16 pages, 9 figures

2602.20933 2026-02-25 cs.CV

Dropping Anchor and Spherical Harmonics for Sparse-view Gaussian Splatting

Shuangkang Fang, I-Chao Shen, Xuanyang Zhang, Zesheng Wang, Yufeng Wang, Wenrui Ding, Gang Yu, Takeo Igarashi

Comments Accepted by CVPR 2026

2602.20932 2026-02-25 cs.LG cs.HC eess.SP

Hierarchic-EEG2Text: Assessing EEG-To-Text Decoding across Hierarchical Abstraction Levels

Anupam Sharma, Harish Katti, Prajwal Singh, Shanmuganathan Raman, Krishna Miyapuram

2602.20926 2026-02-25 cs.AI

HELP: HyperNode Expansion and Logical Path-Guided Evidence Localization for Accurate and Efficient GraphRAG

Yuqi Huang, Ning Liao, Kai Yang, Anning Hu, Shengchao Hu, Xiaoxing Wang, Junchi Yan

2602.20925 2026-02-25 cs.RO cs.CV

LST-SLAM: A Stereo Thermal SLAM System for Kilometer-Scale Dynamic Environments

Zeyu Jiang, Kuan Xu, Changhao Chen

Comments ICRA 2026

2602.20923 2026-02-25 cs.RO

ParkDiffusion++: Ego Intention Conditioned Joint Multi-Agent Trajectory Prediction for Automated Parking using Diffusion Models

Jiarong Wei, Anna Rehr, Christian Feist, Abhinav Valada

Comments ICRA 2026 Camera Ready Version

2602.20921 2026-02-25 cs.LG

On the Generalization Behavior of Deep Residual Networks From a Dynamical System Perspective

Jinshu Huang, Mingfei Sun, Chunlin Wu

2602.20920 2026-02-25 cs.RO

Computer-Aided Design of Rational Motions for 4R and 6R Spatial Mechanism Synthesis

Daniel Huczala, Severinas Zube, Martin Pfurner, Johannes Siegele, Frank C. Park

2602.20918 2026-02-25 cs.AI cs.CL

Predicting Sentence Acceptability Judgments in Multimodal Contexts

Hyewon Jang, Nikolai Ilinykh, Sharid Loáiciga, Jey Han Lau, Shalom Lappin

2602.20915 2026-02-25 cs.RO

Task-oriented grasping for dexterous robots using postural synergies and reinforcement learning

Dimitrios Dimou, José Santos-Victor, Plinio Moreno