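arXivDaily daily academic digest: synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and related fields.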

2604.05719 2026-04-08 cs.CR cs.AI cs.SE

Hackers or Hallucinators? A Comprehensive Analysis of LLM-Based Automated Penetration Testing

Jiaren Peng, Zeqin Li, Chang You, Yan Wang, Hanlin Sun, Xuan Tian, Shuqiao Zhang, Junyi Liu, Jianguo Zhao, Renyang Liu, Haoran Ou, Yuqiang Sun, Jiancheng Zhang, Yutong Jiao, Kunshu Song, Chao Zhang, Fan Shi, Hongda Sun, Rui Yan, Cheng Huang

Abstract

The rapid advancement of Large Language Models (LLMs) has created new opportunities for Automated Penetration Testing (AutoPT), spawning numerous frameworks aimed at achieving end-to-end autonomous attacks. However, despite the proliferation of related studies, existing research generally lacks systematic architectural analysis and large-scale empirical comparisons under a unified benchmark. Therefore, this paper presents the first Systematization of Knowledge (SoK) focusing on the architectural design and comprehensive empirical evaluation of current LLM-based AutoPT frameworks. At the systematization level, we comprehensively review existing framework designs across six dimensions: agent architecture, agent plan, agent memory, agent execution, external knowledge, and benchmarks. At the empirical level, we conduct large-scale experiments on 13 representative open-source AutoPT frameworks and 2 baseline frameworks using a unified benchmark. The experiments consumed over 10 billion tokens in total and generated more than 1,500 execution logs, which were manually reviewed and analyzed over four months by a panel of more than 15 researchers with expertise in cybersecurity. By investigating the latest progress in this rapidly developing field, we provide researchers with a structured taxonomy to understand existing LLM-based AutoPT frameworks and a large-scale empirical benchmark, along with promising directions for future research.

2604.05711 2026-04-08 cs.SE cs.AI cs.CL cs.IR

SemLink: A Semantic-Aware Automated Test Oracle for Hyperlink Verification using Siamese Sentence-BERT

Guan-Yan Yang, Wei-Ling Wen, Shu-Yuan Ku, Farn Wang, Kuo-Hui Yeh

Comments Accepted at the 19th IEEE International Conference on Software Testing, Verification and Validation (ICST) 2026, Daejeon, Republic of Korea

2604.05707 2026-04-08 physics.med-ph cs.LG

Untargeted analysis of volatile markers of post-exercise fat oxidation in exhaled breath

André Homeyer, Júlia Blanka Sziládi, Jan-Philipp Redlich, Jonathan Beauchamp, Y Lan Pham

2604.05678 2026-04-08 math.OC cs.LG math.FA

Intrinsic perturbation scale for certified oracle objectives with epigraphic information

Karim Bounja, Boujemaâ Achchab, Abdeljalil Sakat

2604.05674 2026-04-08 cs.CR cs.AI

From Incomplete Architecture to Quantified Risk: Multimodal LLM-Driven Security Assessment for Cyber-Physical Systems

Shaofei Huang, Christopher M. Poskitt, Lwin Khin Shar

Comments Under submission

2604.05669 2026-04-08 stat.ML cs.LG

Efficient machine unlearning with minimax optimality

Jingyi Xie, Linjun Zhang, Sai Li

2604.05652 2026-04-08 physics.flu-dyn cs.AI cs.LG

Multiscale Physics-Informed Neural Network for Complex Fluid Flows with Long-Range Dependencies

Prashant Kumar, Rajesh Ranjan

Comments 16 pages, 10 figures

2604.05640 2026-04-08 math.OC cs.LG cs.SY eess.SY

Parametric Nonconvex Optimization via Convex Surrogates

Renzi Wang, Panagiotis Patrinos, Alberto Bemporad

2604.05605 2026-04-08 cs.CE cs.AI cs.CL cs.CV cs.ET

INTERACT: An AI-Driven Extended Reality Framework for Accessible Communication Featuring Real-Time Sign Language Interpretation and Emotion Recognition

Nikolaos D. Tantaroudas, Andrew J. McCracken, Ilias Karachalios, Evangelos Papatheou

Comments 20

2604.05591 2026-04-08 cs.CE cs.AI cs.CL cs.CY cs.ET

AI-Driven Modular Services for Accessible Multilingual Education in Immersive Extended Reality Settings: Integrating Speech Processing, Translation, and Sign Language Rendering

N. D. Tantaroudas, A. J. McCracken, I. Karachalios, E. Papatheou

Comments 21

2604.05589 2026-04-08 cs.CR cs.AI

Foundations for Agentic AI Investigations from the Forensic Analysis of OpenClaw

Jan Gruber, Jan-Niclas Hilgert

Comments Preprint. Code and experimental data available at: https://github.com/jgru/forensic-analysis-of-openclaw

2604.05520 2026-04-08 eess.SP cs.AI

Learned Elevation Models as a Lightweight Alternative to LiDAR for Radio Environment Map Estimation

Ljupcho Milosheski, Fedja Močnik, Mihael Mohorčič, Carolina Fortuna

Comments 6 pages, 3 figures, 3 tables. Submitted to PIMRC 2026

2604.05519 2026-04-08 eess.AS cs.HC cs.LG cs.SD eess.SP

Active noise cancellation on open-ear smart glasses

Kuang Yuan, Freddy Yifei Liu, Tong Xiao, Yiwen Song, Chengyi Shen, Saksham Bhutani, Justin Chan, Swarun Kumar

2604.05518 2026-04-08 math.OC cs.LG cs.SY eess.SY stat.ML

Optimal Centered Active Excitation in Linear System Identification

Kaito Ito, Alexandre Proutiere

Comments 11 pages

2604.05502 2026-04-08 cs.CR cs.LG

AttnDiff: Attention-based Differential Fingerprinting for Large Language Models

Haobo Zhang, Zhenhua Xu, Junxian Li, Shangfeng Sheng, Dezhang Kong, Meng Han

Comments Accepted at ACL 2026 Main

2604.05481 2026-04-08 cs.SE cs.AI

On the Role of Fault Localization Context for LLM-Based Program Repair

Melika Sepidband, Hung Viet Pham, Hadi Hemmati

Comments 30 pages, 8 figures

2604.05469 2026-04-08 stat.ME cs.LG stat.ML

Task Ecologies and the Evolution of World-Tracking Representations in Large Language Models

Giulio Valentino Dalla Riva

2604.05467 2026-04-08 cs.IR cs.CL cs.LG

CUE-R: Beyond the Final Answer in Retrieval-Augmented Generation

Siddharth Jain, Venkat Narayan Vedam

Comments 6 figures, 14 tables; appendix includes bootstrap CIs, metric definitions, duplicate position sensitivity, prompt template, and reproducibility details

2604.05462 2026-04-08 stat.ML cs.LG math.ST stat.TH

Hierarchical Contrastive Learning for Multimodal Data

Huichao Li, Junhan Yu, Doudou Zhou

Comments 34 pages, 11 figures

2604.05460 2026-04-08 stat.ME cs.AI

LLM Evaluation as Tensor Completion: Low Rank Structure and Semiparametric Efficiency

Jiachun Li, David Simchi-Levi, Will Wei Sun

2604.05458 2026-04-08 cs.CR cs.AI

MA-IDS: Multi-Agent RAG Framework for IoT Network Intrusion Detection with an Experience Library

Md Shamimul Islam, Luis G. Jaimes, Ayesha S. Dina

Comments Preprint. Submitted to IEEE conference

2604.05440 2026-04-08 cs.CR cs.AI

LanG -- A Governance-Aware Agentic AI Platform for Unified Security Operations

Anes Abdennebi, Nadjia Kara, Laaziz Lahlou, Hakima Ould-Slimane

Abstract

Modern Security Operations Centers struggle with alert fatigue, fragmented tooling, and limited cross-source event correlation, challenges that current Security Information and Event Management and Extended Detection and Response systems only partially address. This paper presents the LLM-assisted network Governance (LanG), an open-source, governance-aware agentic AI platform for unified security operations, contributing: (i) a Unified Incident Context Record with a correlation engine (F1 = 87%), (ii) an Agentic AI Orchestrator on LangGraph with human-in-the-loop checkpoints, (iii) an LLM-based Rule Generator finetuned on four base models producing deployable Snort 2/3, Suricata, and YARA rules (average acceptance rate 96.2%), (iv) a Three-Phase Attack Reconstructor combining Louvain community detection, LLM-driven hypothesis generation, and Bayesian scoring (87.5% kill-chain accuracy), and (v) a layered Governance-MCP-Agentic AI-Security architecture where all tools are exposed via the Model Context Protocol, governed by an AI Governance Policy Engine with a two-layer guardrail pipeline (regex + Llama Prompt Guard 2 semantic classifier, achieving a 98.1% F1 score with zero false positives in experiments). Designed for Managed Security Service Providers, the platform supports multi-tenant isolation, role-based access, and fully local deployment. Finetuned anomaly and threat detectors achieve weighted F1 scores of 99.0% and 91.0%, respectively, on intrusion-detection benchmarks, running inference in ≈21 ms with a machine-side mean time to detect of 1.58 s, and the rule generator exceeds 91% deployability on live IDS engines. A systematic comparison against eight SOC platforms confirms that LanG uniquely combines multiple industrial capabilities in a single open-source tool while enforcing selected AI governance policies.

2604.05432 2026-04-08 cs.CR cs.AI

Your LLM Agent Can Leak Your Data: Data Exfiltration via Backdoored Tool Use

Wuyang Zhang, Shichao Pei

Comments The 64th Annual Meeting of the Association for Computational Linguistics

2604.05398 2026-04-08 math.OC cs.LG

An Actor-Critic Framework for Continuous-Time Jump-Diffusion Controls with Normalizing Flows

Liya Guo, Ruimeng Hu, Xu Yang, Yi Zhu

Comments 29 pages, 7 figures, 4 tables

2604.05387 2026-04-08 cs.IR cs.CL

Data-Driven Function Calling Improvements in Large Language Model for Online Financial QA

Xing Tang, Hao Chen, Shiwei Li, Fuyuan Lyu, Weijie Shi, Lingjie Li, Dugang Liu, Weihong Luo, Xiku Du, Xiuqiang He

Comments Accepted to Webconf 2026 industry track

2604.05379 2026-04-08 cs.IR cs.LG

Retrieve-then-Adapt: Retrieval-Augmented Test-Time Adaptation for Sequential Recommendation

Xing Tang, Jingyang Bin, Ziqiang Cui, Xiaokun Zhang, Fuyuan Lyu, Jingyan Jiang, Dugang Liu, Chen Ma, Xiuqiang He

2604.05368 2026-04-08 cs.HC cs.AI

AI and Collective Decisions: Strengthening Legitimacy and Losers' Consent

Suyash Fulay, Prerna Ravi, Emily Kubin, Shrestha Mohanty, Michiel Bakker, Deb Roy

Comments 11 pages + appendix

2604.05347 2026-04-08 eess.IV cs.CV cs.MM

CI-ICM: Channel Importance-driven Learned Image Coding for Machines

Yun Zhang, Junle Liu, Huan Zhang, Zhaoqing Pan, Gangyi Jiang, Weisi Lin

2604.05337 2026-04-08 stat.ML cs.LG

Individual-heterogeneous sub-Gaussian Mixture Models

Huan Qing

Comments 32 pages, 4 figures, 2 tables

2604.05285 2026-04-08 stat.ME cs.LG

Robust Learning of Heterogeneous Dynamic Systems

Shuoxun Xu, Zijian Guo, Brooke R. Staveland, Robert T. Knight, Lexin Li