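arXivDaily daily academic digest: synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.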

2411.05961 2026-04-08 cs.CV cs.AI

Aligned Vector Quantization for Edge-Cloud Collabrative Vision-Language Models

Xiao Liu, Lijun Zhang, Deepak Ganesan, Hui Guan

Comments I found a big mistake in the paper that causes significant bias on the results. The residual links are not taken into consideration when computing the transmission. All results about the compressed data size and transmission latency would be affected

2411.05183 2026-04-08 cs.CV cs.LG

Why CNN Features Are not Gaussian: A Statistical Anatomy of Deep Representations

David Chapman, Parniyan Farvardin

2410.10238 2026-04-08 cs.CV cs.AI

ForgeryGPT: A Multimodal LLM for Interpretable Image Forgery Detection and Localization

Fanrui Zhang, Jiawei Liu, Jiaying Zhu, Esther Sun, Dong Li, Qiang Zhang, Zheng-Jun Zha

Comments 13 pages, 9 figures

2409.07388 2026-04-08 cs.CL

Recent Advances in Multimodal Affective Computing: An NLP Perspective

Guimin Hu, Weimin Lyu, Chang Sun, Zhihong Zhu, Lin Gui, Ruichu Cai, Erik Cambria, Hasti Seifi

2408.05366 2026-04-08 cs.CV

The DeepSpeak Dataset

Sarah Barrington, Maty Bohacek, Hany Farid

Comments https://github.com/hfaridlab/deepspeak

2406.14514 2026-04-08 cs.AI

Solving a Stackelberg Game on Transportation Networks in a Dynamic Crime Scenario: A Mixed Approach on Multi-Layer Networks

Sukanya Samanta, Kei Kimura, Makoto Yokoo, Palash Dey

2405.06535 2026-04-08 cs.CV cs.LG

Controllable Image Generation with Composed Parallel Token Prediction

Jamie Stirling, Noura Al-Moubayed, Chris G. Willcocks, Hubert P. H. Shum

Comments 8 pages + references, 7 figures, accepted to CVPR Workshops 2026 (LoViF)

2403.13027 2026-04-08 cs.LG cs.CR cs.IT math.IT stat.ML

Towards Better Statistical Understanding of Watermarking LLMs

Zhongze Cai, Shang Liu, Hanzhao Wang, Huaiyang Zhong, Xiaocheng Li

2402.01370 2026-04-08 cs.RO cs.SY eess.SY

CC-VPSTO: Chance-Constrained Via-Point-Based Stochastic Trajectory Optimisation for Online Robot Motion Planning under Uncertainty

Lara Brudermüller, Guillaume Berger, Julius Jankowski, Raunak Bhattacharyya, Raphaël Jungers, Nick Hawes

Comments 23 pages, 12 figures, submitted to International Journal of Robotics Research

2307.02719 2026-04-08 cs.LG stat.ML

Understanding Uncertainty Sampling via Equivalent Loss

Shang Liu, Xiaocheng Li

Comments An updated version of the previous paper titled "Understanding Uncertainty Sampling". Added a major result of sample complexity and other theoretical results; cut the experiment part

2304.03057 2026-04-08 cs.RO

Distributed UAV Formation Control Robust to Relative Pose Measurement Noise

Viktor Walter, Matouš Vrba, Daniel Bonilla Licea, Matej Hilmer, Martin Saska

Comments Submitted to the Robotics and Autonomous Systems journal on May 10, 2025 (revision on February 27, 2026)

2201.01984 2026-04-08 cs.CV cs.CL

Image Captioning via Compact Bidirectional Architecture

Zijie Song, Yuanen Zhou, Zhenzhen Hu, Daqing Liu, Huixia Ben, Richang Hong, Meng Wang

DOI: 10.1109/TMM.2026.3680397

Abstract

Most current image captioning models generate captions from left to right. This unidirectional property means they can leverage only past context, not future context. Though refinement-based models can exploit both past and future context by generating a new caption in the second stage based on pre-retrieved or pre-generated captions in the first stage, the decoder of these models generally consists of two networks (i.e., a retriever or captioner in the first stage and a captioner in the second stage), which can only be executed sequentially. In this paper, we introduce a Compact Bidirectional Transformer model for image captioning that can leverage bidirectional context implicitly and explicitly while the decoder can be executed in parallel. Specifically, it is implemented by tightly coupling left-to-right (L2R) and right-to-left (R2L) flows into a single compact model to serve as a regularization for implicitly exploiting bidirectional context and optionally allowing explicit interaction of the bidirectional flows, while the final caption is chosen from either the L2R or R2L flow in a sentence-level ensemble manner. We conduct extensive ablation studies on the MSCOCO benchmark and find that the compact bidirectional architecture and the sentence-level ensemble play more important roles than the explicit interaction mechanism. By combining with word-level ensemble seamlessly, the effect of sentence-level ensemble is further enlarged. We further extend the conventional one-flow self-critical training to the two-flow version under this architecture and achieve new state-of-the-art results in comparison with non-vision-language-pretraining models. Finally, we verify the generality of this compact bidirectional architecture by extending it to an LSTM backbone. Source code is available at https://github.com/YuanEZhou/cbtic.
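
The sentence-level ensemble mentioned in the abstract above can be pictured with a minimal sketch. This is not the authors' implementation: the `Candidate` type, the function names, and the choice of summed log-probability as the selection score are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Candidate:
    """One decoded caption (hypothetical container, not from the paper)."""
    tokens: list[str]   # tokens in the order they were generated
    log_prob: float     # assumed score: summed token log-probabilities
    direction: str      # "l2r" or "r2l"


def to_reading_order(c: Candidate) -> list[str]:
    # An R2L flow emits the last word first, so reverse it for display.
    return c.tokens if c.direction == "l2r" else c.tokens[::-1]


def sentence_level_ensemble(l2r: Candidate, r2l: Candidate) -> str:
    # Keep the whole sentence from whichever flow scores higher
    # (ties go to the L2R candidate because max returns the first maximum).
    best = max((l2r, r2l), key=lambda c: c.log_prob)
    return " ".join(to_reading_order(best))


l2r = Candidate(["a", "dog", "runs"], log_prob=-2.5, direction="l2r")
r2l = Candidate(["grass", "on", "runs", "dog", "a"], log_prob=-1.9, direction="r2l")
print(sentence_level_ensemble(l2r, r2l))
```

Here the R2L candidate has the higher assumed score, so its tokens are reversed into reading order and returned as the final caption.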

2604.06148 2026-04-08 cs.CR cs.AI cs.MA

Who Governs the Machine? A Machine Identity Governance Taxonomy (MIGT) for AI Systems Operating Across Enterprise and Geopolitical Boundaries

Andrew Kurtz, Klaudia Krawiecka

Comments 75 pages (excl. references), 2 tables. Addresses policy makers, regulators, and practitioners at the intersection of AI governance, cybersecurity, and geopolitical risk

2604.06135 2026-04-08 quant-ph cs.AI cs.LG

Shot-Based Quantum Encoding: A Data-Loading Paradigm for Quantum Neural Networks

Basil Kyriacou, Viktoria Patapovich, Maniraman Periyasamy, Alexey Melnikov

Comments 6 pages, 2 figures, 0 tables

2604.06123 2026-04-08 stat.CO cs.LG econ.EM stat.ME

A Large-Scale Empirical Comparison of Meta-Learners and Causal Forests for Heterogeneous Treatment Effect Estimation in Marketing Uplift Modeling

Aman Singh

Comments 6 pages

2604.06095 2026-04-08 cs.CR cs.AI

LLM4CodeRE: Generative AI for Code Decompilation Analysis and Reverse Engineering

Hamed Jelodar, Samita Bai, Tochukwu Emmanuel Nwankwo, Parisa Hamedi, Mohammad Meymani, Roozbeh Razavi-Far, Ali A. Ghorbani

2604.06093 2026-04-08 eess.SY cs.LG cs.RO cs.SY

eVTOL Aircraft Energy Overhead Estimation under Conflict Resolution in High-Density Airspaces

Alex Zongo, Peng Wei

Comments Accepted for presentation at the Integrated Communications, Navigation and Surveillance Conference (ICNS) 2026

2604.06058 2026-04-08 eess.SY cs.RO cs.SY

Staggered Integral Online Conformal Prediction for Safe Dynamics Adaptation with Multi-Step Coverage Guarantees

Daniel M. Cherenson, Dimitra Panagou

Comments Submitted to CDC 2026

2604.06039 2026-04-08 math.OC cs.LG math.PR

Value Mirror Descent for Reinforcement Learning

Zhichao Jia, Guanghui Lan

Abstract

Value iteration-type methods have been extensively studied for computing a nearly optimal value function in reinforcement learning (RL). Under a generative sampling model, these methods can achieve sharper sample complexity than policy optimization approaches, particularly in their dependence on the discount factor. In practice, they are often employed for offline training or in simulated environments. In this paper, we consider discounted Markov decision processes with state space S, action space A, discount factor $γ\in(0,1)$ and costs in $[0,1]$. We introduce a novel value optimization method, termed value mirror descent (VMD), which integrates mirror descent from convex optimization into the classical value iteration framework. In the deterministic setting with known transition kernels, we show that VMD converges linearly. For the stochastic setting with a generative model, we develop a stochastic variant, SVMD, which incorporates variance reduction commonly used in stochastic value iteration-type methods. For RL problems with general convex regularizers, SVMD attains a near-optimal sample complexity of $\tilde{O}(|S||A|(1-γ)^{-3}ε^{-2})$. Moreover, we establish that the Bregman divergence between the generated and optimal policies remains bounded throughout the iterations. This property is absent in existing stochastic value iteration-type methods but is important for enabling effective online (continual) learning following offline training. Under a strongly convex regularizer, SVMD achieves sample complexity of $\tilde{O}(|S||A|(1-γ)^{-5}ε^{-1})$, improving performance in the high-accuracy regime. Furthermore, we prove convergence of the generated policy to the optimal policy. Overall, the proposed method, its analysis, and the resulting guarantees, constitute new contributions to the RL and optimization literature.
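
One way to read the "high-accuracy regime" remark in the abstract above is to compare the two stated rates directly. The comparison below is only a sketch: it ignores the logarithmic factors hidden in $\tilde{O}$, assumes comparable constants (the abstract does not state them), and sets aside the fact that the two bounds are stated under different regularizer assumptions.

```latex
\[
|S||A|(1-\gamma)^{-5}\epsilon^{-1} \;\le\; |S||A|(1-\gamma)^{-3}\epsilon^{-2}
\quad\Longleftrightarrow\quad
\epsilon \;\le\; (1-\gamma)^{2}.
\]
```

Under these assumptions, with $\gamma = 0.9$ the $\epsilon^{-1}$ rate would be the smaller of the two once $\epsilon \le 0.01$.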

2604.06032 2026-04-08 stat.ML cs.LG

Ensemble-Based Dirichlet Modeling for Predictive Uncertainty and Selective Classification

Courtney Franzen, Farhad Pourkamali-Anaraki

Comments 48 pages

2604.06019 2026-04-08 cs.CR cs.AI

CritBench: A Framework for Evaluating Cybersecurity Capabilities of Large Language Models in IEC 61850 Digital Substation Environments

Gustav Keppler, Moritz Gstür, Veit Hagenmeyer

Comments 16 pages, 4 figures, 3 tables. Submitted to the 3rd ACM SIGEnergy Workshop on Cybersecurity and Privacy of Energy Systems (ACM EnergySP '26)

2604.06001 2026-04-08 physics.comp-ph cs.LG

A deep learning framework for jointly solving transient Fokker-Planck equations with arbitrary parameters and initial distributions

Xiaolong Wang, Jing Feng, Qi Liu, Chengli Tan, Yuanyuan Liu, Yong Xu

2604.05969 2026-04-08 cs.CR cs.AI

A Formal Security Framework for MCP-Based AI Agents: Threat Taxonomy, Verification Models, and Defense Mechanisms

Nirajan Acharya, Gaurav Kumar Gupta

Abstract

The Model Context Protocol (MCP), introduced by Anthropic in November 2024 and now governed by the Linux Foundation's Agentic AI Foundation, has rapidly become the de facto standard for connecting large language model (LLM)-based agents to external tools and data sources, with over 97 million monthly SDK downloads and more than 177000 registered tools. However, this explosive adoption has exposed a critical gap: the absence of a unified, formal security framework capable of systematically characterizing, analyzing, and mitigating the diverse threats facing MCP-based agent ecosystems. Existing security research remains fragmented across individual attack papers, isolated benchmarks, and point defense mechanisms. This paper presents MCPSHIELD, a comprehensive formal security framework for MCP-based AI agents. We make four principal contributions: (1) a hierarchical threat taxonomy comprising 7 threat categories and 23 distinct attack vectors organized across four attack surfaces, grounded in the analysis of over 177000 MCP tools; (2) a formal verification model based on labeled transition systems with trust boundary annotations that enables static and runtime analysis of MCP tool interaction chains; (3) a systematic comparative evaluation of 12 existing defense mechanisms, identifying coverage gaps across our threat taxonomy; and (4) a defense in depth reference architecture integrating capability based access control, cryptographic tool attestation, information flow tracking, and runtime policy enforcement. Our analysis reveals that no existing single defense covers more than 34 percent of the identified threat landscape, whereas MCPSHIELD's integrated architecture achieves theoretical coverage of 91 percent. We further identify seven open research challenges that must be addressed to secure the next generation of agentic AI systems.

2604.05963 2026-04-08 cs.SE cs.LG

QiMeng-PRepair: Precise Code Repair via Edit-Aware Reward Optimization

Changxin Ke, Rui Zhang, Jiaming Guo, Yuanbo Wen, Li Ding, Shuo Wang, Xuyuan Zhu, Xiong Peng, Di Huang, Zidong Du, Xing Hu, Qi Guo, Yunji Chen

Comments Accepted to ACL 2026 main conference

2604.05955 2026-04-08 cs.SE cs.AI

Does Pass Rate Tell the Whole Story? Evaluating Design Constraint Compliance in LLM-based Issue Resolution

Kai Yu, Zhenhao Zhou, Junhao Zeng, Ying Wang, Xueying Du, Zhiqiang Yuan, Junwei Liu, Ziyu Zhou, Yujia Wang, Chong Wang, Xin Peng

2604.05953 2026-04-08 cs.GT cs.AI cs.DS cs.MA

Polynomial-Time Algorithm for Thiele Voting Rules with Voter Interval Preferences

Pasin Manurangsi, Krzysztof Sornat

Comments 30 pages

2604.05904 2026-04-08 eess.SY cs.LG cs.SY

Transfer Learning for Neural Parameter Estimation applied to Building RC Models

Fabian Raisch, Timo Germann, J. Nathan Kutz, Christoph Goebel, Benjamin Tischler

Comments This work has been submitted to the IEEE for possible publication

2604.05890 2026-04-08 cs.IT cs.LG math.IT

A Tensor-Train Framework for Bayesian Inference in High-Dimensional Systems: Applications to MIMO Detection and Channel Decoding

Luca Schmid, Dominik Sulz, Shrinivas Chimmalgi, Laurent Schmalen

2604.05872 2026-04-08 cs.CR cs.AI cs.CL

Swiss-Bench 003: Evaluating LLM Reliability and Adversarial Security for Swiss Regulatory Contexts

Fatih Uenal

Comments 23 pages, 5 figures, 8 tables

2604.05866 2026-04-08 cs.IR cs.CL cs.DL

Beyond Paper-to-Paper: Structured Profiling and Rubric Scoring for Paper-Reviewer Matching

Yicheng Pan, Zhiyuan Ning, Ludi Wang, Yi Du

Comments Accepted by IJCNN-2026