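arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and other fields.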

2604.18716 2026-04-22 cs.CR cs.LG

TrEEStealer: Stealing Decision Trees via Enclave Side Channels

Jonas Sander, Anja Rabich, Nick Mahling, Felix Maurer, Jonah Heller, Qifan Wang, Thomas Eisenbarth, David Oswald

Abstract

Today, machine learning is widely applied in sensitive, security-related, and financially lucrative applications. Model extraction attacks undermine current business models where a model owner sells model access, e.g., via MLaaS APIs. Additionally, stolen models can enable powerful white-box attacks, facilitating privacy attacks on sensitive training data, and model evasion. In this paper, we focus on Decision Trees (DT), which are widely deployed in practice. Existing black-box extraction attacks for DTs are either query-intensive, make strong assumptions about the DT structure, or rely on rich API information. To limit attacks to the black-box setting, CPU vendors introduced Trusted Execution Environments (TEE) that use hardware-mechanisms to isolate workloads from external parties, e.g., MLaaS providers. We introduce TrEEStealer, a high-fidelity extraction attack for stealing TEE-protected DTs. TrEEStealer exploits TEE-specific side-channels to steal DTs efficiently and without strong assumptions about the API output or DT structure. The extraction efficacy stems from a novel algorithm that maximizes the information derived from each query by coupling Control-Flow Information (CFI) with passive information tracking. We use two primitives to acquire CFI: for AMD SEV, we follow previous work using the SEV-Step framework and performance counters. For Intel SGX, we reproduce prior findings on current Xeon 6 CPUs and construct a new primitive to efficiently extract the branch history of inference runs through the Branch-History-Register. We found corresponding vulnerabilities in three popular libraries: OpenCV, mlpack, and emlearn. We show that TrEEStealer achieves superior efficiency and extraction fidelity compared to prior attacks. Our work establishes a new state-of-the-art for DT extraction and confirms that TEEs fail to protect against control-flow leakage.

2604.18697 2026-04-22 cs.CR cs.CL cs.LG

Beyond Indistinguishability: Measuring Extraction Risk in LLM APIs

Ruixuan Liu, David Evans, Li Xiong

Comments Accepted by S&P 2026

2604.18663 2026-04-22 cs.CR cs.AI

Beyond Explicit Refusals: Soft-Failure Attacks on Retrieval-Augmented Generation

Wentao Zhang, Yan Zhuang, ZhuHang Zheng, Mingfei Zhang, Jiawen Deng, Fuji Ren

Comments 22 pages, Accepted to the ACL 2026 Main Conference

2604.18660 2026-04-22 cs.CR cs.AI

Evaluating Answer Leakage Robustness of LLM Tutors against Adversarial Student Attacks

Jin Zhao, Marta Knežević, Tanja Käser

Comments ACL 2026

2604.18658 2026-04-22 cs.CR cs.AI cs.CL

Owner-Harm: A Missing Threat Model for AI Agent Safety

Dongcheng Zhang, Yiqing Jiang

Comments 15 pages. Companion manuscript on per-decision proof-obligation synthesis (LSVJ-S) in preparation

2604.18649 2026-04-22 cs.CR cs.AI

Position: No Retroactive Cure for Infringement during Training

Satoru Utsunomiya, Masaru Isonuma, Junichiro Mori, Ichiro Sakata

Comments 12 pages

2604.18621 2026-04-22 q-bio.GN cs.LG

Quantum AI for Cancer Diagnostic Biomarker Discovery

Mandeep Kaur Saggi, Amandeep Singh Bhatia, Humaira Gowher, Sabre Kais

Comments 25 pages, 15 figures

2604.18616 2026-04-22 cs.DC cs.AI cs.PL

ARGUS: Agentic GPU Optimization Guided by Data-Flow Invariants

Haohui Mai, Xiaoyan Guo, Xiangyun Ding, Daifeng Li, Qiuchu Yu, Chenzhun Guo, Cong Wang, Jiacheng Zhao, Christos Kozyrakis, Binhang Yuan

Abstract

LLM-based coding agents can generate functionally correct GPU kernels, yet their performance remains far below hand-optimized libraries on critical computations such as matrix multiplication, attention, and Mixture-of-Experts (MoE). Peak GPU performance requires coordinated reasoning over tightly coupled optimizations, including tiling, shared-memory staging, software pipelining, and instruction scheduling, while existing agents rely on sparse pass/fail feedback, leaving them unable to diagnose global constraint violations. We present Argus, an agentic framework that addresses this through data-flow invariants: compile-time specifications encoding how data must be choreographed throughout kernel execution. Argus introduces a tile-based, Pythonic DSL exposing hardware instructions and compiler policies while hiding low-level representations. The DSL provides tag functions to propagate symbolic annotations through data and control flow, and tag assertions to enforce relational constraints at use sites. When violations occur, the compiler returns concrete counterexamples identifying the thread, data element, and program point, enabling dense, structured feedback for targeted fixes. Invariants are verified at compile time via abstract interpretation over a layout algebra and SMT solving, with zero runtime overhead. An in-context reinforcement learning planner learns to select optimizations and synthesize effective invariants, supported by a curated knowledge base of GPU optimization techniques. We evaluate Argus on the AMD MI300X GPU across GEMM, flash attention, and MoE kernels accounting for over 90% of GPU time in LLM inference. Generated kernels achieve 99-104% of state-of-the-art hand-optimized assembly throughput and are 2-1543x faster than existing agentic systems. Argus further generalizes to 200 KernelBench tasks, solving 100% of Level 1 and 90% of Level 2 problems.

2604.18612 2026-04-22 cs.NE cs.AI cs.LG

Agent-GWO: Collaborative Agents for Dynamic Prompt Optimization in Large Language Models

Xudong Wang, Chaoning Zhang, Chenghao Li, Shuxu Chen, Qigan Sun, Jiaquan Zhang, Fachrina Dewi Puspitasari, Tae-Ho Kim, Jiwei Wei, Malu Zhang, Guoqing Wang, Yang Yang, Heng Tao Shen

Comments Accepted to ACL 2026. 9 pages, 5 figures

2604.18611 2026-04-22 cs.NE cs.AI cs.LG

Neuromorphic Continual Learning for Sequential Deployment of Nuclear Plant Monitoring Systems

Samrendra Roy, Sajedul Talukder, Syed Bahauddin Alam

2604.18610 2026-04-22 cs.NE cs.AI

SpikeMLLM: Spike-based Multimodal Large Language Models via Modality-Specific Temporal Scales and Temporal Compression

Han Xu, Zhiyong Qin, Di Shang, Jiahong Zhang, Xuerui Qiu, Bo Lei, Tiejun Huang, Bo Xu, Guoqi Li

2604.18607 2026-04-22 cs.NE cs.AI

TurboEvolve: Towards Fast and Robust LLM-Driven Program Evolution

Yang Yang, Zining Zhong, Jindong Li, Jiemin Wu, Kaishen Yuan, Wenshuo Chen, Menglin Yang, Yutao Yue

Comments 12 pages, 8 figures

2604.18606 2026-04-22 eess.SP cs.AI

Thermal Anomaly Detection using Physics Aware Neuromorphic Networks: Comparison between Raw and L1C Sentinel-2 Data

Stephen Smith, Cormac Purcell, Gabriele Meoni, Roberto Del Prete, Zdenka Kuncic

Comments 9 pages, 5 figures

2604.18591 2026-04-22 cs.HC cs.AI

SPRITE: From Static Mockups to Engine-Ready Game UI

Yunshu Bai, RuiHao Li, Hao Zhang, Chien Her Lim, Ming Yan, Mengtian Li

Comments CHI EA '26

2604.18589 2026-04-22 cs.HC cs.AI

CentaurTA Studio: A Self-Improving Human-Agent Collaboration System for Thematic Analysis

Lei Wang, Min Huang, Eduard Dragut

2604.18586 2026-04-22 cs.CY cs.AI cs.CL cs.LG cs.SI

Who Shapes Brazil's Vaccine Debate? Semi-Supervised Modeling of Stance and Polarization in YouTube's Media Ecosystem

Geovana S. de Oliveira, Ana P. C. Silva, Fabricio Murai, Carlos H. G. Ferreira

Comments Paper accepted at WebSci'26

2604.18507 2026-04-22 math.OC cs.AI cs.LG

Learning the Riccati solution operator for time-varying LQR via Deep Operator Networks

Jun Chen, Umberto Biccari, Junmin Wang

2604.18146 2026-04-22 cs.IR cs.AI cs.CL

Modular Representation Compression: Adapting LLMs for Efficient and Effective Recommendations

Yunjia Xi, Menghui Zhu, Jianghao Lin, Bo Chen, Ruiming Tang, Yong Yu, Weinan Zhang

Comments SIGIR 2026

2604.18005 2026-04-22 cs.MA cs.AI cs.CL

Diversity Collapse in Multi-Agent LLM Systems: Structural Coupling and Collective Failure in Open-Ended Idea Generation

Nuo Chen, Yicheng Tong, Yuzhe Yang, Yufei He, Xueyi Zhang, Qingyun Zou, Qian Wang, Bingsheng He

Comments 56 pages, 15 figures; Accepted at ACL 2026 Findings

2604.16369 2026-04-22 cs.CY cs.AI cs.CL

Why AI Readiness Is an Organizational Learning Problem, Not a Technology Purchase

Jeanne McClure, Gregg Gerdau

Comments 8 pages, 2 figures, 1 table

2604.13327 2026-04-22 cs.DC cs.LG cs.PL

Event Tensor: A Unified Abstraction for Compiling Dynamic Megakernel

Hongyi Jin, Bohan Hou, Guanjie Wang, Ruihang Lai, Jinqi Chen, Zihao Ye, Yaxing Cai, Yixin Dong, Xinhao Cheng, Zhihao Zhang, Yilong Zhao, Yingyi Huang, Lijie Yang, Jinchen Jiang, Gabriele Oliaro, Jianan Ji, Xupeng Miao, Vinod Grover, Todd C. Mowry, Zhihao Jia, Tianqi Chen

Comments 16 pages, 18 figures. Accepted at MLSys 2026. References corrected

2604.08608 2026-04-22 cs.CR cs.AI cs.LG

Semantic Intent Fragmentation: A Single-Shot Compositional Attack on Multi-Agent AI Pipelines

Tanzim Ahad, Ismail Hossain, Md Jahangir Alam, Sai Puppala, Yoonpyo Lee, Syed Bahauddin Alam, Sajedul Talukder

Comments Accepted at the AAAI 2026 Summer Symposium

2604.04981 2026-04-22 q-bio.GN cs.LG cs.NE

An Imbalanced Dataset with Multiple Feature Representations for Studying Quality Control of Next-Generation Sequencing

Philipp Röchner, Clarissa Krämer, Johannes U Mayer, Franz Rothlauf, Steffen Albrecht, Maximilian Sprang

2604.03753 2026-04-22 cs.CR cs.LG

Spatiotemporal-Aware Bit-Flip Injection on DNN-based Advanced Driver Assistance Systems (extended version)

Taibiao Zhao, Xiang Zhang, Mingxuan Sun, Ruyi Ding, Xugui Zhou

Comments The authors have identified issues in the experimental setup and evaluation that may affect the validity of the results. In particular, inconsistencies in the fault injection protocol and temporal analysis may lead to incorrect conclusions. The authors therefore request withdrawal for thorough revision

2603.20897 2026-04-22 cs.CY cs.AI cs.AR

The data heat island effect: quantifying the impact of AI data centers in a warming world

Andrea Marinoni, Erik Cambria, Weisi Lin, Mauro Dalla Mura, Jocelyn Chanussot, Edoardo Ragusa, Chi Yan Tso, Yihao Zhu, Benjamin Horton

2603.04445 2026-04-22 cs.NI cs.CL cs.PF

Dynamic Model Routing and Cascading for Efficient LLM Inference: A Survey

Yasmin Moslem, John D. Kelleher

Comments Work funded by ADAPT Centre, Trinity College Dublin, and Huawei Ireland

Abstract

The rapid growth of large language models (LLMs) with diverse capabilities, costs, and domains has created a critical need for intelligent model selection at inference time. While smaller models suffice for routine queries, complex tasks demand more capable models. However, static model deployment does not account for the complexity and domain of incoming queries, leading to suboptimal performance and increased costs. Dynamic routing systems that adaptively select models based on query characteristics have emerged as a solution to this challenge. We provide a systematic analysis of state-of-the-art multi-LLM routing and cascading approaches. In contrast to mixture-of-experts architectures, which route within a single model, we study routing across multiple independently trained LLMs. We cover diverse routing paradigms, including query difficulty, human preferences, clustering, uncertainty quantification, reinforcement learning, multimodality, and cascading. For each paradigm, we analyze representative methods and examine key trade-offs. Beyond taxonomy, we introduce a conceptual framework that characterizes routing systems along three dimensions: when decisions are made, what information is used, and how they are computed. This perspective highlights that practical systems are often compositional, integrating multiple paradigms under operational constraints. Our analysis demonstrates that effective multi-LLM routing requires balancing competing objectives. Choosing the optimal routing strategy depends on deployment and computational constraints. Well-designed routing systems can outperform even the most powerful individual models by strategically leveraging specialized capabilities across models while maximizing efficiency gains. Meanwhile, open challenges remain in developing routing mechanisms that generalize across diverse architectures, modalities, and applications.

2602.18571 2026-04-22 cs.SE cs.AI

Debug2Fix: Can Interactive Debugging Help Coding Agents Fix More Bugs?

Spandan Garg, Yufan Huang

Comments In Review

2602.17036 2026-04-22 cs.IR cs.LG

LiveGraph: Active-Structure Neural Re-ranking for Exercise Recommendation

Rong Fu, Zijian Zhang, Haiyun Wei, Jiekai Wu, Kun Liu, Xianda Li, Haoyu Zhao, Yang Li, Yongtai Liu, Ziming Wang, Rui Lu, Simon Fong

Comments 19 pages, 5 figures

2602.15423 2026-04-22 cs.IR cs.LG

GaiaFlow: Semantic-Guided Diffusion Tuning for Carbon-Frugal Search

Rong Fu, Jia Yee Tan, Chunlei Meng, Shuo Yin, Xiaowen Ma, Wangyu Wu, Muge Qi, Guangzhen Yao, Zhaolu Kang, Zeli Su, Simon Fong

Comments 19 pages, 7 figures

2602.11631 2026-04-22 physics.geo-ph cs.LG

Enforcing Reciprocity in Operator Learning for Seismic Wave Propagation

Caifeng Zou, Yaozhong Shi, Zachary E. Ross, Robert W. Clayton, Kamyar Azizzadenesheli