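arXivDaily daily academic digest: synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.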

2602.22685 2026-02-27 cs.LG

Switch-Hurdle: A MoE Encoder with AR Hurdle Decoder for Intermittent Demand Forecasting

Fabian Muşat, Simona Căbuz

Abstract

Intermittent demand, a pattern characterized by long sequences of zero sales punctuated by sporadic non-zero values, poses a persistent challenge in retail and supply chain forecasting. Traditional methods such as ARIMA, exponential smoothing, and Croston variants, as well as modern neural architectures such as DeepAR and Transformer-based models, often underperform on such data, because they treat demand as a single continuous process or become computationally expensive when scaled across many sparse series. To address these limitations, we introduce Switch-Hurdle, a new framework that integrates a Mixture-of-Experts (MoE) encoder with a hurdle-based probabilistic decoder. The encoder uses sparse Top-1 expert routing in the forward pass, while the backward pass is approximately dense via a straight-through estimator (STE). The decoder follows a cross-attention autoregressive design with a shared hurdle head that explicitly separates the forecasting task into two components: a binary classification component that estimates the probability of a sale, and a conditional regression component that predicts the quantity given a sale. This structured separation enables the model to capture both the occurrence and the magnitude processes inherent to intermittent demand. Empirical results on the M5 benchmark and a large proprietary retail dataset show that Switch-Hurdle achieves state-of-the-art prediction performance while maintaining scalability.
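
The two-part hurdle idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' decoder: it assumes a zero-truncated Poisson for the quantity given a sale (the abstract does not name the distribution), and the function names `hurdle_pmf`, `hurdle_mean`, and `hurdle_nll` are hypothetical.

```python
import math


def hurdle_pmf(k: int, p_sale: float, lam: float) -> float:
    """Probability of observing demand k under a simple hurdle model.

    p_sale: probability that a sale occurs at all (the binary component).
    lam:    rate of a zero-truncated Poisson for the quantity given a sale
            (an assumed choice for illustration only).
    """
    if k == 0:
        return 1.0 - p_sale
    # Poisson pmf, renormalized so that it puts no mass on k = 0.
    poisson_k = math.exp(-lam) * lam**k / math.factorial(k)
    return p_sale * poisson_k / (1.0 - math.exp(-lam))


def hurdle_mean(p_sale: float, lam: float) -> float:
    """Expected demand: P(sale) times the mean of the truncated Poisson."""
    return p_sale * lam / (1.0 - math.exp(-lam))


def hurdle_nll(observations, p_sale: float, lam: float) -> float:
    """Negative log-likelihood of a series of non-negative integer demands."""
    return -sum(math.log(hurdle_pmf(y, p_sale, lam)) for y in observations)


if __name__ == "__main__":
    series = [0, 0, 0, 2, 0, 0, 5, 0, 0, 0]  # a made-up intermittent series
    print(hurdle_mean(p_sale=0.2, lam=3.0))
    print(hurdle_nll(series, p_sale=0.2, lam=3.0))
```

Keeping the occurrence probability and the conditional quantity as separate parameters means the zeros only inform `p_sale`, while the non-zero values only inform `lam`, which is the separation the abstract attributes to the hurdle head.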

2602.22681 2026-02-27 cs.LG

Accelerating LLM Pre-Training through Flat-Direction Dynamics Enhancement

Shuchen Zhu, Rizhen Hu, Mingze Wang, Mou Sun, Xue Wang, Kun Yuan, Zaiwen Wen

2602.22678 2026-02-27 cs.CV cs.AI

ViCLIP-OT: The First Foundation Vision-Language Model for Vietnamese Image-Text Retrieval with Optimal Transport

Quoc-Khang Tran, Minh-Thien Nguyen, Nguyen-Khang Pham

Comments Preprint submitted to Expert Systems with Applications

2602.22674 2026-02-27 cs.CV

SPMamba-YOLO: An Underwater Object Detection Network Based on Multi-Scale Feature Enhancement and Global Context Modeling

Guanghao Liao, Zhen Liu, Liyuan Cao, Yonghui Yang, Qi Li

Comments 31 pages, 10 figures, 6 tables. This paper presents SPMamba-YOLO, an underwater object detection framework integrating multi-scale feature enhancement and global context modeling. The work is under review

2602.22671 2026-02-27 cs.RO cs.ET

Does the testing environment matter? Carsickness across on-road, test-track, and driving simulator conditions

Georgios Papaioannou, Barys Shyrokau

2602.22663 2026-02-27 cs.RO

Rethinking the Practicality of Vision-language-action Model: A Comprehensive Benchmark and An Improved Baseline

Wenxuan Song, Jiayi Chen, Xiaoquan Sun, Huashuo Lei, Yikai Qin, Wei Zhao, Pengxiang Ding, Han Zhao, Tongxin Wang, Pengxu Hou, Zhide Zhong, Haodong Yan, Donglin Wang, Jun Ma, Haoang Li

Comments Accepted by ICRA 2026

2602.22661 2026-02-27 cs.CL cs.AI cs.LG

dLLM: Simple Diffusion Language Modeling

Zhanhui Zhou, Lingjie Chen, Hanghang Tong, Dawn Song

Comments Code available at: https://github.com/ZHZisZZ/dllm

2602.22150 2026-02-27 cs.CV

CoLoGen: Progressive Learning of Concept-Localization Duality for Unified Image Generation

YuXin Song, Yu Lu, Haoyuan Sun, Huanjin Yao, Fanglong Liu, Yifan Sun, Haocheng Feng, Hang Zhou, Jingdong Wang

Comments Accepted by CVPR2026. 15 pages, 8 figures

2602.21893 2026-02-27 cs.CV

EndoDDC: Learning Sparse to Dense Reconstruction for Endoscopic Robotic Navigation via Diffusion Depth Completion

Yinheng Lin, Yiming Huang, Beilei Cui, Long Bai, Huxin Gao, Hongliang Ren, Jiewen Lai

Comments Accepted by ICRA 2026

2602.21585 2026-02-27 cs.LG cs.AI cs.CL stat.ML

Duel-Evolve: Reward-Free Test-Time Scaling via LLM Self-Preferences

Sweta Karlekar, Carolina Zheng, Magnus Saebo, Nicolas Beltran-Velez, Shuyang Yu, John Bowlan, Michal Kucer, David Blei

2602.21189 2026-02-27 cs.LG cs.AI

Why Pass@k Optimization Can Degrade Pass@1: Prompt Interference in LLM Post-training

Anas Barakat, Souradip Chakraborty, Khushbu Pahwa, Amrit Singh Bedi

Comments updated related work discussion

2602.20963 2026-02-27 cs.RO

A Robotic Testing Platform for Pipelined Discovery of Resilient Soft Actuators

Ang Li, Alexander Yin, Alexander White, Sahib Sandhu, Matthew Francoeur, Victor Jimenez-Santiago, Van Remenar, Codrin Tugui, Mihai Duduta

2602.20903 2026-02-27 cs.CV

TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering

Hanshen Zhu, Yuliang Liu, Xuecheng Wu, An-Lan Wang, Hao Feng, Dingkang Yang, Chao Feng, Can Huang, Jingqun Tang, Xiang Bai

Comments Accepted by CVPR 2026; Code: https://github.com/CIawevy/TextPecker

2602.20031 2026-02-27 cs.AI cs.LG

Latent Introspection: Models Can Detect Prior Concept Injections

Theia Pearson-Vogel, Martin Vanek, Raymond Douglas, Jan Kulveit

Comments 28 pages, 17 figures. Submitted to ICML 2026. Workshop version submitted to ICLR 2026 Workshop on Latent and Implicit Thinking

2602.19964 2026-02-27 cs.LG cs.AI math.PR stat.ML

On the Equivalence of Random Network Distillation, Deep Ensembles, and Bayesian Inference

Moritz A. Zanger, Yijun Wu, Pascal R. Van der Vaart, Wendelin Böhmer, Matthijs T. J. Spaan

Comments 8 pages, 1 Figure

2602.19805 2026-02-27 cs.LG cs.AI

Decision MetaMamba: Enhancing Selective SSM in Offline RL with Heterogeneous Sequence Mixing

Wall Kim, Chaeyoung Song, Hanul Kim

Comments This work was intended as a replacement of arXiv:2408.10517 and any subsequent updates will appear there

2602.19128 2026-02-27 cs.AI

K-Search: LLM Kernel Generation via Co-Evolving Intrinsic World Model

Shiyi Cao, Ziming Mao, Joseph E. Gonzalez, Ion Stoica

2602.17072 2026-02-27 cs.CL

BankMathBench: A Benchmark for Numerical Reasoning in Banking Scenarios

Yunseung Lee, Subin Kim, Youngjun Kwak, Jaegul Choo

Comments LREC 2026

Abstract

Chatbots based on large language models (LLMs) are increasingly being adopted in the financial domain, particularly in digital banking, to handle customer inquiries about products such as deposits, savings, and loans. However, these models still exhibit low accuracy in core banking computations, including total payout estimation, comparison of products with varying interest rates, and interest calculation under early repayment conditions. Such tasks require multi-step numerical reasoning and contextual understanding of banking products, yet existing LLMs often make systematic errors: misinterpreting product types, applying conditions incorrectly, or failing basic calculations involving exponents and geometric progressions. Such errors have rarely been captured by existing benchmarks. Mathematical datasets focus on fundamental math problems, whereas financial benchmarks primarily target financial documents, leaving everyday banking scenarios underexplored. To address this limitation, we propose BankMathBench, a domain-specific dataset that reflects realistic banking tasks. BankMathBench is organized into three levels of difficulty (basic, intermediate, and advanced), corresponding to single-product reasoning, multi-product comparison, and multi-condition scenarios, respectively. When trained on BankMathBench, open-source LLMs exhibited notable improvements in both formula generation and numerical reasoning accuracy, demonstrating the dataset's effectiveness in enhancing domain-specific reasoning. With tool-augmented fine-tuning, the models achieved average accuracy increases of 57.6 percentage points (basic), 75.1 percentage points (intermediate), and 62.9 percentage points (advanced), representing significant gains over zero-shot baselines. These findings highlight BankMathBench as a reliable benchmark for evaluating and advancing LLMs' numerical reasoning in real-world banking scenarios.
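
To make concrete what "calculations involving exponents and geometric progressions" can look like in this setting, here is a small sketch of two payout formulas. The products, rates, and compounding conventions are hypothetical and are not taken from BankMathBench; taxes and fees are ignored, and the function names are illustrative only.

```python
def deposit_payout(principal: float, annual_rate: float, years: int,
                   compounds_per_year: int = 12) -> float:
    """Maturity value of a lump-sum deposit with periodic compounding.

    Uses the exponent form principal * (1 + r/m) ** (m * years).
    """
    periodic_rate = annual_rate / compounds_per_year
    periods = years * compounds_per_year
    return principal * (1 + periodic_rate) ** periods


def installment_payout(monthly_deposit: float, annual_rate: float,
                       months: int) -> float:
    """Maturity value of an installment savings plan (assumed conventions).

    Each deposit is made at the start of a month and compounds monthly, so
    the total is the geometric series d*(1+r) + d*(1+r)**2 + ... + d*(1+r)**n.
    """
    r = annual_rate / 12
    if r == 0:
        return monthly_deposit * months
    # Closed form of the geometric series above.
    return monthly_deposit * (1 + r) * ((1 + r) ** months - 1) / r


if __name__ == "__main__":
    # Compare two hypothetical products over the same 24-month horizon.
    print(deposit_payout(12_000, 0.03, years=2))
    print(installment_payout(500, 0.04, months=24))
```

A multi-product comparison of the kind the abstract calls "intermediate" would chain several such formulas, which is where a single misapplied condition or exponent changes the final answer.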

2602.15457 2026-02-27 cs.LG

Benchmarking IoT Time-Series AD with Event-Level Augmentations

Dmitry Zhevnenko, Ilya Makarov, Aleksandr Kovalenko, Fedor Meshchaninov, Anton Kozhukhov, Vladislav Travnikov, Makar Ippolitov, Kirill Yashunin, Iurii Katser

Comments https://underline.io/events/521/sessions/21822/lecture/143905-benchmarking-iot-time-series-ad-with-event-level-augmentations?tab=poster

2602.13507 2026-02-27 cs.CV

Benchmarking Video Foundation Models for Remote Parkinson's Disease Screening

Md Saiful Islam, Ekram Hossain, Abdelrahman Abdelkader, Tariq Adnan, Fazla Rabbi Mashrur, Sooyong Park, Praveen Kumar, Qasim Sudais, Natalia Chunga, Nami Shah, Jan Freyberg, Christopher Kanan, Ruth Schneider, Ehsan Hoque

2602.12125 2026-02-27 cs.LG cs.AI cs.CL

Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation

Wenkai Yang, Weijie Liu, Ruobing Xie, Kai Yang, Saiyong Yang, Yankai Lin

Comments v2, update results under stronger teachers with more RL training steps

Abstract

On-policy distillation (OPD), which aligns the student with the teacher's logit distribution on student-generated trajectories, has demonstrated strong empirical gains in improving student performance and often outperforms off-policy distillation and reinforcement learning (RL) paradigms. In this work, we first theoretically show that OPD is a special case of dense KL-constrained RL where the reward function and the KL regularization are always weighted equally and the reference model can be any model. Then, we propose the Generalized On-Policy Distillation (G-OPD) framework, which extends the standard OPD objective by introducing a flexible reference model and a reward scaling factor that controls the relative weight of the reward term against the KL regularization. Through comprehensive experiments on math reasoning and code generation tasks, we derive two novel insights: (1) Setting the reward scaling factor to be greater than 1 (i.e., reward extrapolation), which we term ExOPD, consistently improves over standard OPD across a range of teacher-student size pairings. In particular, in the setting where we merge the knowledge from different domain experts, obtained by applying domain-specific RL to the same student model, back into the original student, ExOPD enables the student to surpass the teacher's performance boundary and outperform the domain teachers. (2) Building on ExOPD, we further find that in the strong-to-weak distillation setting (i.e., distilling a smaller student from a larger teacher), performing reward correction by choosing the reference model as the teacher's base model before RL yields a more accurate reward signal and further improves distillation performance. However, this choice assumes access to the teacher's pre-RL variant and incurs more computational overhead. We hope our work offers new insights for future research on OPD.
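
One way to build intuition for the reward scaling factor is a toy single-step example. The sketch below assumes the reward is the log-ratio of teacher to reference probabilities (an assumption of this illustration, not something the abstract states) and uses the standard closed-form maximizer of a KL-regularized objective over a categorical distribution; `extrapolated_policy` is a hypothetical name.

```python
import math


def extrapolated_policy(teacher, reference, scale):
    """Maximizer of  scale * E_pi[log(teacher / reference)] - KL(pi || reference)
    over categorical distributions pi (single decision, toy setting).

    The closed form is pi*(a) proportional to
    reference(a) ** (1 - scale) * teacher(a) ** scale.
    """
    logits = [(1.0 - scale) * math.log(q) + scale * math.log(p)
              for p, q in zip(teacher, reference)]
    shift = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - shift) for l in logits]
    normalizer = sum(weights)
    return [w / normalizer for w in weights]


if __name__ == "__main__":
    teacher = [0.6, 0.3, 0.1]
    reference = [1 / 3, 1 / 3, 1 / 3]
    # scale = 1: the reference cancels and the optimum is the teacher itself.
    print(extrapolated_policy(teacher, reference, scale=1.0))
    # scale > 1: the optimum moves past the teacher, away from the reference.
    print(extrapolated_policy(teacher, reference, scale=2.0))
```

Under these assumptions, a scale of 1 recovers the teacher for any choice of reference, which matches the abstract's remark that the reference model can be arbitrary in standard OPD, while a scale above 1 sharpens the distribution beyond the teacher.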

2602.12099 2026-02-27 cs.CV

GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning

GigaBrain Team, Boyuan Wang, Bohan Li, Chaojun Ni, Guan Huang, Guosheng Zhao, Hao Li, Jie Li, Jindi Lv, Jingyu Liu, Lv Feng, Mingming Yu, Peng Li, Qiuping Deng, Tianze Liu, Xinyu Zhou, Xinze Chen, Xiaofeng Wang, Yang Wang, Yifan Li, Yifei Nie, Yilong Li, Yukun Zhou, Yun Ye, Zhichao Liu, Zheng Zhu

Comments https://gigabrain05m.github.io/

2602.10195 2026-02-27 cs.LG cs.AI hep-th

Versor: A Geometric Sequence Architecture

Truong Minh Huy, Edward Hirst

Comments 19+28 pages, 5 figures

2602.05597 2026-02-27 cs.AI cs.HC cs.MA

Emulating Aggregate Human Choice Behavior and Biases with GPT Conversational Agents

Stephen Pilli, Vivek Nallur

Comments Accepted at CHI'26. The text overlap with arXiv:2601.11049 arises from commonalities in the Appendix due to shared experimental material

2602.05535 2026-02-27 cs.LG

Detecting Misbehaviors of Large Vision-Language Models by Evidential Uncertainty Quantification

Tao Huang, Rui Wang, Xiaofei Liu, Yi Qin, Li Duan, Liping Jing

Comments Accepted to ICLR 2026. Code is available at https://github.com/HT86159/EUQ

2602.02334 2026-02-27 cs.CV cs.AI cs.LG

VQ-Style: Disentangling Style and Content in Motion with Residual Quantized Representations

Fatemeh Zargarbashi, Dhruv Agrawal, Jakob Buhmann, Martin Guay, Stelian Coros, Robert W. Sumner

2602.01749 2026-02-27 cs.AI cs.LG

Controlling Exploration-Exploitation in GFlowNets via Markov Chain Perspectives

Lin Chen, Samuel Drapeau, Fanghao Shao, Xuekai Zhu, Bo Xue, Yunchong Song, Mathieu Laurière, Zhouhan Lin

2602.01434 2026-02-27 cs.LG math.ST stat.TH

Phase Transitions for Feature Learning in Neural Networks

Andrea Montanari, Zihao Wang

Comments 75 pages; 17 pdf figures; v2 is a minor revision of v1

2602.00564 2026-02-27 cs.AI cs.CL

Unmasking Reasoning Processes: A Process-aware Benchmark for Evaluating Structural Mathematical Reasoning in LLMs

Xiang Zheng, Weiqi Zhai, Wei Wang, Boyu Yang, Wenbo Li, Ruixiang Luo, Haoxiang Sun, Yucheng Wang, Zhengze Li, Meng Wang, Yuetian Du, Guojie Lin, Yaxuan Wang, Xiaoxiao Xu, Yanhu Mo, Xuan Ren, Hu Wei, Bing Zhao

Comments 8 pages and 3 figures

2602.00299 2026-02-27 cs.LG

Agentic Framework for Epidemiological Modeling

Rituparna Datta, Zihan Guan, Baltazar Espinoza, Yiqi Su, Priya Pitre, Srini Venkatramanan, Naren Ramakrishnan, Anil Vullikanti