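arXivDaily daily academic digest: synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and more.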

2604.15409 2026-04-20 cs.LG cs.AI

The Illusion of Equivalence: Systematic FP16 Divergence in KV-Cached Autoregressive Inference

Ranjith Chodavarapu, Lei Xu

Abstract

KV caching is a ubiquitous optimization in autoregressive transformer inference, long presumed to be numerically equivalent to cache-free computation. This assumption fails under standard FP16 precision: cache-ON and cache-OFF execution paths employ different floating-point accumulation orderings which, due to FP16 non-associativity, produce a deterministic divergence in decoded token sequences. Across three open-weight models (LLaMA-2-7B, Mistral-7B-v0.3, Gemma-2-2B) evaluated on GSM8K, we observe a 100% token divergence rate across all sampling strategies, including greedy decoding, which rules out sampling randomness as a cause. Cache-ON also yields higher accuracy in 8 of 9 conditions, and this accuracy difference indicates that the divergence direction is systematic rather than random. Controlled FP32 falsification reduces divergence by eight orders of magnitude, eliminates token flips, and drops the flip rate to exactly 0.0%, confirming FP16 non-associativity as the sole causal driver. Layer-wise drift profiling reveals architecturally predictable propagation patterns: models using Grouped-Query Attention exhibit sharp divergence at the first layer, while Gemma's larger head dimension and sliding window attention produce uniform accumulation across all layers. Finally, activation patching of the entire residual stream fails to recover the cache-free trajectory, localizing the causal variable to the stateful KV cache. These findings establish that FP16 KV cache inference is fundamentally non-equivalent to recomputation and provide a mechanistic framework for understanding numerical instability in modern LLM inference systems.
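The non-associativity the abstract refers to can be seen in a toy example. The sketch below is not the paper's experiment: the helper `accumulate` and the three-term input are hypothetical, chosen only to show that summing the same values in a different order can change an FP16 result while leaving the FP32 result unchanged.

```python
import numpy as np


def accumulate(values, dtype):
    """Sum values left to right, rounding to `dtype` after every addition.

    Hypothetical helper for illustration; real attention kernels use
    different (and hardware-dependent) accumulation schemes.
    """
    total = dtype(0)
    for v in values:
        total = dtype(total + dtype(v))
    return total


# Same three terms, two accumulation orders.
terms = [2048.0, 1.0, 1.0]

fp16_forward = accumulate(terms, np.float16)        # large term first
fp16_reverse = accumulate(terms[::-1], np.float16)  # small terms first

fp32_forward = accumulate(terms, np.float32)
fp32_reverse = accumulate(terms[::-1], np.float32)

print("FP16:", fp16_forward, fp16_reverse)
print("FP32:", fp32_forward, fp32_reverse)
```

In FP16 the spacing between representable values at 2048 is 2, so adding 1 to 2048 is rounded away, whereas adding the two 1s first gives 2 and survives. FP32 has enough mantissa bits that both orders agree here, which is consistent in spirit with the FP32 control described in the abstract.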

2604.15400 2026-04-20 cs.LG cs.AI cs.CL

Hallucination as Trajectory Commitment: Causal Evidence for Asymmetric Attractor Dynamics in Transformer Generation

G. Aytug Akarlar

Comments 21 pages, 12 figures, 8 tables. Code and data: https://github.com/akarlaraytu/trajectory-commitment

2604.15398 2026-04-20 cs.LG cs.NA math.NA

Python library supporting Discrete Variational Formulations and training solutions with Collocation-based Robust Variational Physics Informed Neural Networks (DVF-CRVPINN)

Tomasz Służalec, Marcin Łoś, Askold Vilkha, Maciej Paszyński

Comments Python library, Robust Variational Physics-Informed Neural Networks, Collocation Methods, Robust loss, Stokes Equations, Laplace problem

2604.15395 2026-04-20 cs.RO

Foundation Models in Robotics: A Comprehensive Review of Methods, Models, Datasets, Challenges and Future Research Directions

Aggelos Psiris, Vasileios Argyriou, Evangelos K. Markakis, Panagiotis Sarigiannidis, Efstratios Gavves, Kostas Bekris, Arash Ajoudani and Georgios Th. Papadopoulos

2604.15392 2026-04-20 cs.LG cs.AI stat.ML

Lightweight Geometric Adaptation for Training Physics-Informed Neural Networks

Kang An, Chenhao Si, Shiqian Ma, Ming Yan

Comments 22 pages, Chenhao Si and Kang An contributed equally to this work. Their authorship order was determined randomly

2604.15383 2026-04-20 cs.SD cs.AI

Temporal Contrastive Decoding: A Training-Free Method for Large Audio-Language Models

Yanda Li, Yuhan Liu, Zirui Song, Yunchao Wei, Martin Takáč, Salem Lahlou

Comments ACL 2026 Findings

2604.15377 2026-04-20 cs.LG cs.CV cs.MM

M3R: Localized Rainfall Nowcasting with Meteorology-Informed MultiModal Attention

Sanjeev Panta, Rhett M Morvant, Xu Yuan, Li Chen, Nian-Feng Tzeng

Comments Accepted at IEEE International Conference on Multimedia and Expo (ICME) 2026

2604.15376 2026-04-20 cs.CV cs.AI

Zoom Consistency: A Free Confidence Signal in Multi-Step Visual Grounding Pipelines

Keon Kim, Krish Chelikavada

2604.15371 2026-04-20 cs.CL cs.AI cs.LG

Applied Explainability for Large Language Models: A Comparative Study

Venkata Abhinandan Kancharla

Comments 14 pages, 3 figures, comparative study of explainability methods for transformer-based NLP models; also available on Zenodo

2604.15360 2026-04-20 cs.LG cs.SY eess.SY

Mapping High-Performance Regions in Battery Scheduling across Data Uncertainty, Battery Design, and Planning Horizons

Jaime de Miguel Rodriguez, Artjom Vargunin, Brigitta Robin Raudne, David Solis Martin, Yaroslava Mykhailenko, Kaarel Oja

Comments 40 pages

2604.15356 2026-04-20 cs.LG cs.AI cs.IT cs.NE math.IT

Sequential KV Cache Compression via Probabilistic Language Tries: Beyond the Per-Vector Shannon Limit

Gregory Magarshak

Comments 22 Pages

2604.15351 2026-04-20 cs.LG cs.CL

Aletheia: Gradient-Guided Layer Selection for Efficient LoRA Fine-Tuning Across Architectures

Abdulmalek Saket

Comments 11 pages, 5 figures, 2 frozen evidence campaigns, 81 experiment rows across 14 successful models and 8 architecture families, plus one documented failed Pythia/GPT-NeoX attempt

2604.15350 2026-04-20 cs.LG

The Spectral Geometry of Thought: Phase Transitions, Instruction Reversal, Token-Level Dynamics, and Perfect Correctness Prediction in How Transformers Reason

Yi Liu

2604.15301 2026-04-20 cs.CV

Think in Latent Thoughts: A New Paradigm for Gloss-Free Sign Language Translation

Yiyang Jiang, Li Zhang, Xiao-Yong Wei, Li Qing

Comments Accepted to ACL 2026 Main

2604.15284 2026-04-20 cs.CV

GlobalSplat: Efficient Feed-Forward 3D Gaussian Splatting via Global Scene Tokens

Roni Itkin, Noam Issachar, Yehonatan Keypur, Xingyu Chen, Anpei Chen, Sagie Benaim

2604.15001 2026-04-20 cs.AI

COEVO: Co-Evolutionary Framework for Joint Functional Correctness and PPA Optimization in LLM-Based RTL Generation

Heng Ping, Peiyu Zhang, Shixuan Li, Wei Yang, Anzhe Cheng, Shukai Duan, Xiaole Zhang, Paul Bogdan

2604.14967 2026-04-20 cs.CV cs.AI

UniDoc-RL: Coarse-to-Fine Visual RAG with Hierarchical Actions and Dense Rewards

Jun Wang, Shuo Tan, Zelong Sun, Tiancheng Gu, Yongle Zhao, Ziyong Feng, Kaicheng Yang, Zhiwu Lu

Comments 17 pages, 11 figures

2604.14646 2026-04-20 cs.AI

Targeted Exploration via Unified Entropy Control for Reinforcement Learning

Chen Wang, Lai Wei, Yanzhi Zhang, Chenyang Shao, Zedong Dan, Weiran Huang, Ge Lan, Yue Wang

Comments Accepted for publication in Findings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)

2604.14605 2026-04-20 cs.CV

Towards Design Compositing

Abhinav Mahajan, Abhikhya Tripathy, Sudeeksha Reddy Pala, Vaibhav Methi, K J Joseph, Balaji Vasan Srinivasan

Comments Accepted to CVEU workshop at CVPR 2026

2604.14518 2026-04-20 cs.AI

Mind DeepResearch Technical Report

MindDR Team, Li Auto Inc

2604.14388 2026-04-20 cs.CV

FoodSense: A Multisensory Food Dataset and Benchmark for Predicting Taste, Smell, Texture, and Sound from Images

Sabab Ishraq, Aarushi Aarushi, Juncai Jiang, Chen Chen

2604.14373 2026-04-20 cs.CV cs.AI

SatBLIP: Context Understanding and Feature Identification from Satellite Imagery with Vision-Language Learning

Xue Wu, Shengting Cao, Shenglin Li, Jiaqi Gong

2604.14174 2026-04-20 cs.CL cs.LG

Correcting Suppressed Log-Probabilities in Language Models with Post-Transformer Adapters

Bryan Sanchez

Comments 12 pages, 3 figures, code at https://github.com/SolomonB14D3/qwen-adapter-correction

2604.13846 2026-04-20 cs.CL

Beyond Static Personas: Situational Personality Steering for Large Language Models

Zesheng Wei, Mengxiang Li, Zilei Wang, Yang Deng

Comments Accepted to Findings of ACL2026

2604.13660 2026-04-20 cs.CV

VRAG-DFD: Verifiable Retrieval-Augmentation for MLLM-based Deepfake Detection

Hui Han, Shunli Wang, Yandan Zhao, Taiping Yao, Shouhong Ding

2604.13508 2026-04-20 cs.CV

Enhancing Mixture-of-Experts Specialization via Cluster-Aware Upcycling

Sanghyeok Chu, Pyunghwan Ahn, Gwangmo Song, SeungHwan Kim, Honglak Lee, Bohyung Han

Comments Accepted to CVPR 2026

2604.13226 2026-04-20 cs.LG cs.AI

KV Packet: Recomputation-Free Context-Independent KV Caching for LLMs

Chuangtao Chen, Grace Li Zhang, Xunzhao Yin, Cheng Zhuo, Bing Li, Ulf Schlichtmann

2604.13081 2026-04-20 cs.LG cs.AI cs.NE

Selectivity and Shape in the Design of Forward-Forward Goodness Functions

Talha Ruzgar Akkus, Suayp Talha Kocabay, Kamer Ali Yuksel, Hassan Sawaf

2604.13058 2026-04-20 cs.CL cs.LG cs.MM

KMMMU: Evaluation of Massive Multi-discipline Multimodal Understanding in Korean Language and Context

Nahyun Lee, Guijin Son, Hyunwoo Ko, Chanyoung Kim, JunYoung An, Kyubeen Han, Il-Youp Kwak

Comments 8 pages

2604.12617 2026-04-20 cs.LG cs.AI

SOAR: Self-Correction for Optimal Alignment and Refinement in Diffusion Models

You Qin, Linqing Wang, Hao Fei, Roger Zimmermann, Liefeng Bo, Qinglin Lu, Chunyu Wang