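arXivDaily daily academic digest: synchronized with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.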

2604.06015 2026-04-08 cs.AI

How LLMs Follow Instructions: Skillful Coordination, Not a Universal Mechanism

Elisabetta Rocchetti, Alfio Ferrara

Abstract

Instruction tuning is commonly assumed to endow language models with a domain-general ability to follow instructions, yet the underlying mechanism remains poorly understood. Does instruction-following rely on a universal mechanism or compositional skill deployment? We investigate this through diagnostic probing across nine diverse tasks in three instruction-tuned models. Our analysis provides converging evidence against a universal mechanism. First, general probes trained across all tasks consistently underperform task-specific specialists, indicating limited representational sharing. Second, cross-task transfer is weak and clustered by skill similarity. Third, causal ablation reveals sparse asymmetric dependencies rather than shared representations. Tasks also stratify by complexity across layers, with structural constraints emerging early and semantic tasks emerging late. Finally, temporal analysis shows constraint satisfaction operates as dynamic monitoring during generation rather than pre-generation planning. These findings indicate that instruction-following is better characterized as skillful coordination of diverse linguistic capabilities rather than deployment of a single abstract constraint-checking process.

2604.06013 2026-04-08 cs.AI cs.CL

Epistemic Blinding: An Inference-Time Protocol for Auditing Prior Contamination in LLM-Assisted Analysis

Michael Cuccarese

Comments Code and LLM skill at: https://github.com/mcuccarese/epistemic-blinding; 7 pages, 3 figures

Abstract

This paper presents epistemic blinding in the context of an agentic system that uses large language models to reason across multiple biological datasets for drug target prioritization. During development, it became apparent that LLM outputs silently blend data-driven inference with memorized priors about named entities - and the blend is invisible: there is no way to determine, from a single output, how much came from the data on the page and how much came from the model's training memory. Epistemic blinding is a simple inference-time protocol that replaces entity identifiers with anonymous codes before prompting, then compares outputs against an unblinded control. The protocol does not make LLM reasoning deterministic, but it restores one critical axis of auditability: measuring how much of an output came from the supplied data versus the model's parametric knowledge. The complete target identification system is described - including LLM-guided evolutionary optimization of scoring functions and blinded agentic reasoning for target rationalization - with demonstration that both stages operate without access to entity identity. In oncology drug target prioritization across four cancer types, blinding changes 16% of top-20 predictions while preserving identical recovery of validated targets. The contamination problem is shown to generalize beyond biology: in S&P 500 equity screening, brand-recognition bias reshapes 30-40% of top-20 rankings across five random seeds. To lower the barrier to adoption, the protocol is released as an open-source tool and as a Claude Code skill that enables one-command epistemic blinding within agentic workflows. The claim is not that blinded analysis produces better results, but that without blinding, there is no way to know to what degree the agent is adhering to the analytical process the researcher designed.
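
The blinding step described in this abstract (replace entity identifiers with anonymous codes before prompting, then compare against an unblinded control) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the released tool or its API: the function names `make_codebook`, `blind`, `unblind`, and `changed_fraction`, the `ENT_001` code format, and the top-k comparison metric are all hypothetical choices made for this example.

```python
import random
import re


def make_codebook(entities, seed=0):
    """Map each entity name to an anonymous code such as 'ENT_003'.

    The code format is an assumption for illustration only.
    """
    rng = random.Random(seed)
    shuffled = sorted(entities)
    rng.shuffle(shuffled)  # shuffle so code order does not leak alphabetical order
    return {name: f"ENT_{i:03d}" for i, name in enumerate(shuffled, start=1)}


def blind(text, codebook):
    """Replace every whole-word entity mention in the text with its code."""
    # Try longer names first so a short name never shadows a longer one.
    names = sorted(codebook, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    return pattern.sub(lambda m: codebook[m.group(1)], text)


def unblind(codes, codebook):
    """Translate a list of anonymous codes back to entity names."""
    reverse = {code: name for name, code in codebook.items()}
    return [reverse[c] for c in codes]


def changed_fraction(blinded_top, control_top):
    """Fraction of the control top-k that is absent from the blinded top-k."""
    control = set(control_top)
    return len(control - set(blinded_top)) / len(control)


# Hypothetical input: three gene symbols with made-up scores.
codebook = make_codebook(["EGFR", "KRAS", "TP53"], seed=0)
prompt = "EGFR score 0.9; KRAS score 0.4; TP53 score 0.7"
blinded_prompt = blind(prompt, codebook)
```

In use, `blinded_prompt` would be sent to the model in place of `prompt`, the returned codes would be passed through `unblind`, and `changed_fraction` would be computed between the blinded and unblinded top-k lists to estimate how much the rankings depend on entity identity.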

2604.06010 2026-04-08 cs.CV

OmniCamera: A Unified Framework for Multi-task Video Generation with Arbitrary Camera Control

Yukun Wang, Ruihuang Li, Jiale Tao, Shiyuan Yang, Liyi Chen, Zhantao Yang, Handz, Yulan Guo, Shuai Shao, Qinglin Lu

2604.06005 2026-04-08 cs.CL

Disentangling MLP Neuron Weights in Vocabulary Space

Asaf Avrahamy, Yoav Gur-Arieh, Mor Geva

2604.05998 2026-04-08 cs.RO cs.SY eess.SY

Force Polytope-Based Cant-Angle Selection for Tilting Hexarotor UAVs

Alberto Piccina, Massimiliano Bertoni, Angelo Cenedese, Giulia Michieletto

2604.05995 2026-04-08 cs.CL cs.AI cs.LG

The Model Agreed, But Didn't Learn: Diagnosing Surface Compliance in Large Language Models

Xiaojie Gu, Ziying Huang, Weicong Hong, Jian Xie, Renze Lou, Kai Zhang

Comments ACL 2026 Findings

2604.05993 2026-04-08 cs.LG stat.ML

Data Distribution Valuation Using Generalized Bayesian Inference

Cuong N. Nguyen, Cuong V. Nguyen

Comments Paper published at AISTATS 2026

2604.05987 2026-04-08 cs.AI

Flowr -- Scaling Up Retail Supply Chain Operations Through Agentic AI in Large Scale Supermarket Chains

Eranga Bandara, Ross Gore, Sachin Shetty, Piumi Siyambalapitiya, Sachini Rajapakse, Isurunima Kularathna, Pramoda Karunarathna, Ravi Mukkamala, Peter Foytik, Safdar H. Bouk, Abdul Rahman, Xueping Liang, Amin Hass, Tharaka Hewa, Ng Wee Keong, Kasun De Zoysa, Aruna Withanage, Nilaan Loganathan

2604.05978 2026-04-08 cs.RO

Automating Manual Tasks through Intuitive Robot Programming and Cognitive Robotics

Bijan Kavousian, Petar Tesic, Oliver Petrovic, Christian Brecher

Comments This submission contains both an English translation and the original German version. The German version was originally published in the Proceedings of the 71st GfA Conference (2025)

2604.05971 2026-04-08 cs.CV cs.CL

Is CLIP Cross-Eyed? Revealing and Mitigating Center Bias in the CLIP Family

Oscar Chew, Hsiao-Ying Huang, Kunal Jain, Tai-I Chen, Khoa D Doan, Kuan-Hao Huang

2604.05967 2026-04-08 cs.LG math.DS math.OC

On Dominant Manifolds in Reservoir Computing Networks

Noa Kaplan, Alberto Padoan, Anastasia Bizyaeva

Comments 6 pages, 3 figures

2604.05965 2026-04-08 cs.AI

Beyond Compromise: Pareto-Lenient Consensus for Efficient Multi-Preference LLM Alignment

Renxuan Tan, Rongpeng Li, Zhifeng Zhao, Honggang Zhang

2604.05961 2026-04-08 cs.CV

HumANDiff: Articulated Noise Diffusion for Motion-Consistent Human Video Generation

Tao Hu, Varun Jampani

Comments Project page: https://taohuumd.github.io/projects/HumANDiff/

2604.05960 2026-04-08 cs.LG

A Mixture of Experts Foundation Model for Scanning Electron Microscopy Image Analysis

Sk Miraj Ahmed, Yuewei Lin, Chuntian Cao, Shinjae Yoo, Xinpei Wu, Won-Il Lee, Nikhil Tiwale, Dan N. Le, Thi Thu Huong Chu, Jiyoung Kim, Kevin G. Yager, Chang-Yong Nam

2604.05954 2026-04-08 cs.RO

You're Pushing My Buttons: Instrumented Learning of Gentle Button Presses

Raman Talwar, Remko Proesmans, Thomas Lips, Andreas Verleysen, Francis wyffels

Comments ICRA 2026 workshop paper

2604.05952 2026-04-08 cs.AI cs.CL

Towards Trustworthy Report Generation: A Deep Research Agent with Progressive Confidence Estimation and Calibration

Yi Yuan, Xuhong Wang, Shanzhe Lei

Comments 20 pages, 3 tables, 2 figures

2604.05947 2026-04-08 cs.CV

Mixture-of-Modality-Experts with Holistic Token Learning for Fine-Grained Multimodal Visual Analytics in Driver Action Recognition

Tianyi Liu, Yiming Li, Wenqian Wang, Jiaojiao Wang, Chen Cai, Yi Wang, Kim-Hui Yap

Comments 11 pages, 3 figures

2604.05943 2026-04-08 cs.AI

MARL-GPT: Foundation Model for Multi-Agent Reinforcement Learning

Maria Nesterova, Mikhail Kolosov, Anton Andreychuk, Egor Cherepanov, Oleg Bulichev, Alexey Kovalev, Konstantin Yakovlev, Aleksandr Panov, Alexey Skrynnik

Comments Accepted at AAMAS 2026 (AAAI Track)

2604.05942 2026-04-08 cs.CL

BOSCH: Black-Box Binary Optimization for Short-Context Attention-Head Selection in LLMs

Abbas Ghaddar, Ivan Kobyzev, Boxing Chen, Yufei Cui

Comments ACL 2026 (Main Conference)

2604.05939 2026-04-08 cs.AI cs.HC

Context-Value-Action Architecture for Value-Driven Large Language Model Agents

TianZe Zhang, Sirui Sun, Yuhang Xie, Xin Zhang, Zhiqiang Wu, Guojie Song

Comments Accepted to Findings of the Association for Computational Linguistics: ACL 2026

2604.05934 2026-04-08 cs.CV eess.IV

Leveraging Image Editing Foundation Models for Data-Efficient CT Metal Artifact Reduction

Ahmet Rasim Emirdagi, Süleyman Aslan, Mısra Yavuz, Görkay Aydemir, Yunus Bilge Kurt, Nasrin Rahimi, Burak Can Biner, M. Akın Yılmaz

Comments Accepted to CVPRW 2026 Med-Reasoner

2604.05931 2026-04-08 cs.CV cs.AI

Saliency-Guided Representation with Consistency Policy Learning for Visual Unsupervised Reinforcement Learning

Jingbo Sun, Qichao Zhang, Songjun Tu, Xing Fang, Yupeng Zheng, Haoran Li, Ke Chen, Dongbin Zhao

2604.05930 2026-04-08 cs.CL cs.AI

"I See What You Did There": Can Large Vision-Language Models Understand Multimodal Puns?

Naen Xu, Jiayi Sheng, Changjiang Li, Chunyi Zhou, Yuyuan Li, Tianyu Du, Jun Wang, Zhihui Fu, Jinbao Li, Shouling Ji

Comments ACL 2026 Main

2604.05929 2026-04-08 cs.LG cs.AI cs.DM

ReLU Networks for Exact Generation of Similar Graphs

Mamoona Ghafoor, Tatsuya Akutsu

2604.05923 2026-04-08 cs.LG cs.CL

The UNDO Flip-Flop: A Controlled Probe for Reversible Semantic State Management in State Space Model

Hongxu Zhou

2604.05912 2026-04-08 cs.CL

FrontierFinance: A Long-Horizon Computer-Use Benchmark of Real-World Financial Tasks

Michael Krumdick, Varshini Reddy, Shivani Chaudhary, William Day, Maarij Ahmed, Hayan Haqqi, Muhammad Ahsen Fahim, Hanzallah Amjad, Ahmad Orakzai, Aqsa Gul, Chris Tanner

2604.05908 2026-04-08 cs.CV

Appearance Decomposition Gaussian Splatting for Multi-Traversal Reconstruction

Yangyi Xiao, Siting Zhu, Baoquan Yang, Tianchen Deng, Yongbo Chen, Hesheng Wang

2604.05906 2026-04-08 cs.CV cs.AI

Selective Aggregation of Attention Maps Improves Diffusion-Based Visual Interpretation

Jungwon Park, Jungmin Ko, Dongnam Byun, Wonjong Rhee

2604.05900 2026-04-08 cs.CV

AICA-Bench: Holistically Examining the Capabilities of VLMs in Affective Image Content Analysis

Dong She, Xianrong Yao, Liqun Chen, Jinghe Yu, Yang Gao, Zhanpeng Jin

Comments Accepted by Findings of ACL 2026

2604.05899 2026-04-08 cs.CL

FRENCH-YMCA: A FRENCH Corpus meeting the language needs of Youth, froM Children to Adolescents

Cherifa Ben Khelil, Jean-Yves Antoine, Anaïs Halftermeyer, Frédéric Rayar, Mathieu Thebaud

Comments 5 pages, 1 figure