FuzzingBrain V2: A Multi-Agent LLM System for Automated Vulnerability Discovery and Reproduction
Ze Sheng, Zhicheng Chen, Qingxiao Xu, Kewen Zhu, Jeff Huang
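AI summary: This paper proposes FuzzingBrain V2, a multi-agent system that addresses three challenges LLMs face in vulnerability detection (automated analysis, precise localization, and reasoning about complex dependencies), achieving a 90% detection rate and discovering 29 zero-day vulnerabilities.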
Software vulnerabilities pose critical security threats, with nearly 50,000 CVEs reported in 2025. While Large Language Models (LLMs) show promise for automated vulnerability detection, three key challenges remain. First, LLM-generated vulnerability reports suffer from high false-positive rates and lack reproducible verification. Second, existing LLM-based approaches localize vulnerabilities at suboptimal granularities: function-level analysis overlooks bugs when the context becomes extensive, while line-level analysis lacks sufficient context. Third, existing approaches have difficulty reasoning about vulnerabilities with complex cross-function dependencies and triggering conditions. We present FuzzingBrain V2, a multi-agent system that addresses these gaps through four key contributions: (1) fully automated vulnerability analysis built on Google's OSS-Fuzz, ensuring that all reported vulnerabilities are fuzzer-reproducible; (2) Suspicious Point, a novel control-flow-based abstraction for precise vulnerability localization at the optimal granularity; (3) logic-driven hierarchical function analysis with dual-layer fuzzing, which improves function coverage under resource constraints; (4) MCP-based static and dynamic analysis tools with context engineering, which strengthen reasoning about complex vulnerabilities. On the AIxCC 2025 Final Competition C/C++ dataset, FuzzingBrain V2 achieved a 90% detection rate (36 of 40 vulnerabilities). In real-world deployment, FuzzingBrain V2 discovered 29 zero-day vulnerabilities across 12 open-source projects, all confirmed and fixed by maintainers, with 2 assigned CVE IDs.
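To make the idea of "fuzzer-reproducible" reports concrete, the following is a minimal sketch of a reproduction gate: a claimed vulnerability is kept only if a concrete input actually crashes the target. It is an illustration under stated assumptions, not the FuzzingBrain V2 implementation. The names `Report`, `toy_target`, `reproduces`, and `filter_reproducible` are hypothetical, and a Python `IndexError` stands in for a crash that a sanitizer would flag in a real OSS-Fuzz harness.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Report:
    """Hypothetical LLM-generated vulnerability claim with a candidate input."""
    description: str
    candidate_input: bytes


def toy_target(data: bytes) -> None:
    """Stand-in for a fuzz harness (assumption, not a real OSS-Fuzz target).

    The first byte is treated as a declared payload length; the payload is
    copied into a fixed 4-byte buffer without a bounds check.
    """
    buf = bytearray(4)
    declared_len = data[0] if data else 0
    payload = data[1:1 + declared_len]
    for i, b in enumerate(payload):
        buf[i] = b  # IndexError here plays the role of a detected overflow


def reproduces(target: Callable[[bytes], None], data: bytes) -> bool:
    """Return True only if running the target on this input crashes."""
    try:
        target(data)
    except Exception:
        return True
    return False


def filter_reproducible(target: Callable[[bytes], None],
                        reports: List[Report]) -> List[Report]:
    """Keep only the reports whose candidate input triggers a crash."""
    return [r for r in reports if reproduces(target, r.candidate_input)]


reports = [
    Report("overflow when the length byte exceeds the buffer size", b"\x08AAAAAAAA"),
    Report("crash on empty input", b""),  # plausible-sounding, but a false positive
]
confirmed = filter_reproducible(toy_target, reports)
for r in confirmed:
    print("confirmed:", r.description)
```

In this toy setup the second report is dropped because its input does not crash the target, which is the sense in which a reproduction requirement filters out false positives.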