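arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and more.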

2603.12541 2026-03-16 cs.LG cs.SY eess.SY

As Language Models Scale, Low-order Linear Depth Dynamics Emerge

Buddhika Nettasinghe, Geethu Joseph

Details

Abstract

Large language models are often viewed as high-dimensional nonlinear systems and treated as black boxes. Here, we show that transformer depth dynamics admit accurate low-order linear surrogates within context. Across tasks including toxicity, irony, hate speech and sentiment, a 32-dimensional linear surrogate reproduces the layerwise sensitivity profile of GPT-2-large with near-perfect agreement, capturing how the final output shifts under additive injections at each layer. We then uncover a surprising scaling principle: for a fixed-order linear surrogate, agreement with the full model improves monotonically with model size across the GPT-2 family. This linear surrogate also enables principled multi-layer interventions that require less energy than standard heuristic schedules when applied to the full model. Together, our results reveal that as language models scale, low-order linear depth dynamics emerge within contexts, offering a systems-theoretic foundation for analyzing and controlling them.
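To make the idea of a layerwise sensitivity profile concrete, here is a minimal toy sketch. It is not the authors' surrogate: it assumes a hypothetical linear depth recursion x_{l+1} = A x_l + u_l with a scalar readout y = c · x_L, and the names `sensitivity_profile`, `run`, `A`, and `c` are illustrative only. Under that assumption, an injection u at layer l shifts the output by ((A^T)^(L-1-l) c) · u, so the whole profile follows from one backward pass.

```python
def mat_vec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def sensitivity_profile(A, c, num_layers):
    """Return g where g[l] . u is the output shift from injecting u at layer l.

    Assumes the toy model x_{l+1} = A x_l + u_l, y = c . x_L (hypothetical).
    """
    At = transpose(A)
    g = [None] * num_layers
    g[num_layers - 1] = list(c)          # last-layer injection reaches y directly
    for l in range(num_layers - 2, -1, -1):
        g[l] = mat_vec(At, g[l + 1])     # propagate the readout one layer back
    return g

def run(A, c, x0, injections):
    """Simulate the toy depth recursion with one injection vector per layer."""
    x = list(x0)
    for u in injections:
        x = [xi + ui for xi, ui in zip(mat_vec(A, x), u)]
    return sum(ci * xi for ci, xi in zip(c, x))

# Illustrative values only.
A = [[0.5, 0.1], [0.0, 0.8]]
c = [1.0, -1.0]
profile = sensitivity_profile(A, c, num_layers=3)
```

Comparing `profile[l]` against finite-difference shifts of a full model would be one way to measure agreement, though the paper's exact agreement metric is not given in the abstract.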


2603.12540 2026-03-16 cs.LG cs.AI

Embedded Quantum Machine Learning in Embedded Systems: Feasibility, Hybrid Architectures, and Quantum Co-Processors

Somdip Dey, Syed Muhammad Raza

Comments: 6 pages, 1 figure, 5th International Conference Computing, Mathematics & Engineering Technologies (iCoMET 2026)

2603.12538 2026-03-16 cs.CV cs.AI

Spatio-Semantic Expert Routing Architecture with Mixture-of-Experts for Referring Image Segmentation

Alaa Dalaq, Muzammil Behzad

2603.12520 2026-03-16 cs.LG cs.AI cs.CL

When LLM Judge Scores Look Good but Best-of-N Decisions Fail

Eddie Landesberg

2603.12517 2026-03-16 cs.LG cs.CV

Curriculum Sampling: A Two-Phase Curriculum for Efficient Training of Flow Matching

Pengwei Sun

2603.12516 2026-03-16 cs.LG physics.flu-dyn

Learning Pore-scale Multiphase Flow from 4D Velocimetry

Chunyang Wang, Linqi Zhu, Yuxuan Gu, Robert van der Merwe, Xin Ju, Catherine Spurin, Samuel Krevor, Rex Ying, Tobias Pfaff, Martin J. Blunt, Tom Bultreys, Gege Wen

2603.12513 2026-03-16 cs.CV

MemRoPE: Training-Free Infinite Video Generation via Evolving Memory Tokens

Youngrae Kim, Qixin Hu, C.-C. Jay Kuo, Peter A. Beerel

Comments: 9 pages main, 3 pages references, 6 pages appendix. Project page: https://memrope.github.io

2603.12512 2026-03-16 cs.LG

Byzantine-Robust Optimization under $(L_0, L_1)$-Smoothness

Arman Bolatov, Samuel Horváth, Martin Takáč, Eduard Gorbunov

Comments: 10 pages, 1 table, 4 figures, accepted to CPAL 2026

2603.12506 2026-03-16 cs.CV cs.AI cs.LG

Naïve PAINE: Lightweight Text-to-Image Generation Improvement with Prompt Evaluation

Joong Ho Kim, Nicholas Thai, Souhardya Saha Dip, Dong Lao, Keith G. Mills

Comments: Code available at https://github.com/LSU-ATHENA/Naive-PAINE

2603.12505 2026-03-16 cs.RO

Robots that redesign themselves through kinematic self-destruction

Chen Yu, Sam Kriegman

2603.12499 2026-03-16 cs.LG

Probing Length Generalization in Mamba via Image Reconstruction

Jan Rathjens, Robin Schiewer, Laurenz Wiskott, Anand Subramoney

2603.12493 2026-03-16 cs.CV

RAW-Domain Degradation Models for Realistic Smartphone Super-Resolution

Ali Mosleh, Faraz Ali, Fengjia Zhang, Stavros Tsogkas, Junyong Lee, Alex Levinshtein, Michael S. Brown

Comments: This paper has been accepted to The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2026

2603.12488 2026-03-16 cs.RO

COAD: Constant-Time Planning for Continuous Goal Manipulation with Compressed Library and Online Adaptation

Adil Shiyas, Zhuoyun Zhong, Constantinos Chamzas

Comments: Adil Shiyas and Zhuoyun Zhong contributed equally to this work

2603.12487 2026-03-16 cs.LG

Modal Logical Neural Networks for Financial AI

Antonin Sulc

Comments: 4 pages, 1 figure, Accepted at ICLR 2026 FinAI

2603.12483 2026-03-16 cs.AI cs.LG

Generating Expressive and Customizable Evals for Timeseries Data Analysis Agents with AgentFuel

Aadyaa Maddi, Prakhar Naval, Deepti Mande, Shane Duan, Muckai Girish, Vyas Sekar

2603.12482 2026-03-16 cs.CV

CalliMaster: Mastering Page-level Chinese Calligraphy via Layout-guided Spatial Planning

Tianshuo Xu, Tiantian Hong, Zhifei Chen, Fei Chao, Ying-cong Chen

2603.12480 2026-03-16 cs.RO cs.AI

One-Step Flow Policy: Self-Distillation for Fast Visuomotor Policies

Shaolong Li, Lichao Sun, Yongchao Chen

2603.12478 2026-03-16 cs.CV cs.LG

Less Data, Faster Convergence: Goal-Driven Data Optimization for Multimodal Instruction Tuning

Rujie Wu, Haozhe Zhao, Hai Ci, Yizhou Wang

2603.12471 2026-03-16 cs.CL cs.HC

Marked Pedagogies: Examining Linguistic Biases in Personalized Automated Writing Feedback

Mei Tan, Lena Phalen, Dorottya Demszky

Comments: To appear in LAK 2026

2603.12468 2026-03-16 cs.CV

Adaptation of Weakly Supervised Localization in Histopathology by Debiasing Predictions

Alexis Guichemerre, Banafsheh Karimian, Soufiane Belharbi, Natacha Gillet, Nicolas Thome, Pourya Shamsolmoali, Mohammadhadi Shateri, Luke McCaffrey, Eric Granger

Comments: 10 pages, 4 figures

Details

Abstract

Weakly Supervised Object Localization (WSOL) models enable joint classification and region-of-interest localization in histology images using only image-class supervision. When deployed in a target domain, distribution shift remains a major cause of performance degradation, especially when applied to new organs or institutions with different staining protocols and scanner characteristics. Under stronger cross-domain shifts, WSOL predictions can become biased toward dominant classes, producing highly skewed pseudo-label distributions in the target domain. Source-Free (Unsupervised) Domain Adaptation (SFDA) methods are commonly employed to address domain shift. However, because they rely on self-training, the initial bias is reinforced over training iterations, degrading both classification and localization tasks. We identify this amplification of prediction bias as a primary obstacle to the SFDA of WSOL models in histopathology. This paper introduces SFDA-DeP, a method inspired by machine unlearning that formulates SFDA as an iterative process of identifying and correcting prediction bias. It periodically identifies target images from over-predicted classes and selectively reduces the predictive confidence for uncertain (high entropy) images, while preserving confident predictions. This process reduces the drift of decision boundaries and bias toward dominant classes. A jointly optimized pixel-level classifier further restores discriminative localization features under distribution shift. Extensive experiments on cross-organ and cross-center histopathology benchmarks (GlaS, CAMELYON-16, CAMELYON-17) with several WSOL models show that SFDA-DeP consistently improves classification and localization over state-of-the-art SFDA baselines. Code: https://anonymous.4open.science/r/SFDA-DeP-1797/
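The selection step described in the abstract (find over-predicted classes, then pick their high-entropy images) can be illustrated with a small sketch. This is not the authors' implementation: the over-prediction rule (a class whose pseudo-label count exceeds a hypothetical `ratio` times the uniform share), the `entropy_threshold` parameter, and the function names are assumptions made purely for illustration.

```python
import math
from collections import Counter

def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_for_debiasing(probs_list, ratio=1.5, entropy_threshold=0.5):
    """Return indices of uncertain images assigned to over-predicted classes.

    probs_list: one softmax vector per target image.
    ratio, entropy_threshold: hypothetical knobs, not taken from the paper.
    """
    num_classes = len(probs_list[0])
    # Pseudo-label = argmax of each prediction.
    pseudo = [max(range(num_classes), key=p.__getitem__) for p in probs_list]
    counts = Counter(pseudo)
    uniform_share = len(probs_list) / num_classes
    # A class counts as over-predicted if it clearly exceeds its uniform share.
    over = {k for k, n in counts.items() if n > ratio * uniform_share}
    return [
        i for i, (p, y) in enumerate(zip(probs_list, pseudo))
        if y in over and entropy(p) > entropy_threshold
    ]

# Illustrative predictions: class 0 dominates the pseudo-labels.
preds = [[0.9, 0.1], [0.55, 0.45], [0.6, 0.4], [0.95, 0.05], [0.3, 0.7]]
selected = select_for_debiasing(preds)
```

In this toy input, only the low-confidence images labeled with the dominant class are selected; confident predictions are left alone, matching the behavior the abstract describes. How the selected images' confidence is then reduced is not specified in the abstract and is not sketched here.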


2603.12460 2026-03-16 cs.RO

Predictive and adaptive maps for long-term visual navigation in changing environments

Lucie Halodova, Eliska Dvorakova, Filip Majer, Tomas Vintr, Oscar Martinez Mozos, Feras Dayoub, Tomas Krajnik

2603.12459 2026-03-16 cs.LG cs.CV

Bases of Steerable Kernels for Equivariant CNNs: From 2D Rotations to the Lorentz Group

Alan Garbarz

Comments: 28 pages. Comments are welcome

2603.12458 2026-03-16 cs.CL cs.AI

Shattering the Shortcut: A Topology-Regularized Benchmark for Multi-hop Medical Reasoning in LLMs

Xing Zi, Xinying Zhou, Jinghao Xiao, Catarina Moreira, Mukesh Prasad

2603.12430 2026-03-16 cs.CV

Surg-R1: A Hierarchical Reasoning Foundation Model for Scalable and Interpretable Surgical Decision Support with Multi-Center Clinical Validation

Jian Jiang, Chenxi Lin, Yiming Gu, Zengyi Qin, Zhitao Zeng, Kun Yuan, Yonghao Long, Xiang Xia, Cheng Yuan, Yuqi Wang, Zijie Yue, Kunyi Yang, Yuting Zhang, Zhu Zhuo, Dian Qin, Xin Wang, NG Chi Fai, Brian Anthony, Daguang Xu, Guy Rosman, Ozanan Meireles, Zizhen Zhang, Nicolas Padoy, Hesheng Wang, Qi Dou, Yueming Jin, Yutong Ban

2603.12423 2026-03-16 cs.CL

Interpreting Negation in GPT-2: Layer- and Head-Level Causal Analysis

Abdullah Al Mofael, Lisa M. Kuhn, Ghassan Alkadi, Kuo-Pao Yang

Comments: 9 pages, 4 figures, 1 table. Accepted at the 2026 IEEE 16th Annual Computing and Communication Workshop and Conference (CCWC)

Details

DOI: 10.1109/CCWC67433.2026.11393646
Journal ref: 2026 IEEE 16th Annual Computing and Communication Workshop and Conference (CCWC), 2026, pp. 42-50

Abstract

Negation remains a persistent challenge for modern language models, often causing reversed meanings or factual errors. In this work, we conduct a causal analysis of how GPT-2 Small internally processes such linguistic transformations. We examine its hidden representations at both the layer and head level. Our analysis is based on a self-curated 12,000-pair dataset of matched affirmative and negated sentences, covering multiple linguistic templates and forms of negation. To quantify this behavior, we define a metric, the Negation Effect Score (NES), which measures the model's sensitivity in distinguishing between affirmative statements and their negations. We carried out two key interventions to probe causal structure. In activation patching, internal activations from affirmative sentences were inserted into their negated counterparts to see how meaning shifted. In ablation, specific attention heads were temporarily disabled to observe how logical polarity changed. Together, these steps revealed how negation signals move and evolve through GPT-2's layers. Our findings indicate that this capability is not widespread; instead, it is highly concentrated within a limited number of mid-layer attention heads, primarily within layers 4 to 6. Ablating these specific components directly disrupts the model's negation sensitivity: on our in-domain dataset, ablation increased NES (indicating weaker negation sensitivity), and re-introducing cached affirmative activations (rescue) increased NES further, confirming that these heads carry affirmative signal rather than restoring baseline behavior. On xNot360, ablation slightly decreased NES and rescue restored performance above baseline. This demonstrates that these causal patterns are consistent across various negation forms and remain detectable on the external xNot360 benchmark, though with smaller magnitude.
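Activation patching, as described in the abstract, can be illustrated on a toy network. The sketch below is not the authors' code and does not involve GPT-2: it assumes a hypothetical two-layer ReLU network, and `forward`, `W1`, `w2`, and the inputs are illustrative stand-ins. A hidden activation cached from a "clean" run is written into a "corrupted" run to see which unit carries the difference in output.

```python
def relu(v):
    return [max(0.0, a) for a in v]

def forward(x, W1, w2, patch=None):
    """Run a toy two-layer network; optionally overwrite hidden activations.

    patch: optional dict {hidden_index: value} applied after the ReLU.
    Returns (output, hidden_activations).
    """
    h = relu([sum(w * xi for w, xi in zip(row, x)) for row in W1])
    if patch:
        for i, value in patch.items():
            h[i] = value                 # insert the cached activation
    out = sum(w * hi for w, hi in zip(w2, h))
    return out, h

# Hypothetical weights and inputs for illustration.
W1 = [[1.0, 0.0], [0.0, 1.0]]
w2 = [1.0, -2.0]
clean_x, corrupted_x = [1.0, 0.0], [1.0, 1.0]

clean_out, clean_h = forward(clean_x, W1, w2)          # clean_out == 1.0
corrupted_out, _ = forward(corrupted_x, W1, w2)        # corrupted_out == -1.0

# Patch each hidden unit in turn with its clean activation.
effects = {
    i: forward(corrupted_x, W1, w2, patch={i: clean_h[i]})[0] - corrupted_out
    for i in range(len(clean_h))
}
```

Here patching hidden unit 1 restores the clean output while patching unit 0 changes nothing, which localizes the effect to unit 1. Ablation would follow the same pattern with the patched value fixed (for example, to zero) instead of taken from a cached run.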


2603.12421 2026-03-16 cs.CV

A Neuro-Symbolic Framework Combining Inductive and Deductive Reasoning for Autonomous Driving Planning

Hongyan Wei, Wael AbdAlmageed

Comments: Under review. 16 pages, 2 figures

2603.12414 2026-03-16 cs.LG cs.CR

SpectralGuard: Detecting Memory Collapse Attacks in State Space Models

Davi Bonetto

Comments: 24 pages, 10 figures. Code, dataset, and demo: https://github.com/DaviBonetto/spectralguard

2603.12409 2026-03-16 cs.CV

ABRA: Teleporting Fine-Tuned Knowledge Across Domains for Open-Vocabulary Object Detection

Mattia Bernardi, Chiara Cappellino, Matteo Mosconi, Enver Sangineto, Angelo Porrello, Simone Calderara

2603.12408 2026-03-16 cs.RO cs.LG

Beyond Motion Imitation: Is Human Motion Data Alone Sufficient to Explain Gait Control and Biomechanics?

Xinyi Liu, Jangwhan Ahn, Edgar Lobaton, Jennie Si, He Huang

Comments: 8 pages, 7 figures

2603.12397 2026-03-16 cs.CL

Not Just the Destination, But the Journey: Reasoning Traces Causally Shape Generalization Behaviors

Pengcheng Wen, Yanxu Zhu, Jiapeng Sun, Han Zhu, Yujin Zhou, Chi-Min Chan, Sirui Han, Yike Guo