Bridging the Smart City Cybersecurity Data Gap Through AI-Driven Synthetic Dataset Generation
Stephanie Polczynski, John D. Hastings, Varghese Vaidyan, Kyle Korman
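AI summary: Proposes an AI-based synthetic data generation framework that uses generative models to produce high-fidelity cybersecurity datasets, addressing the scarcity of real-world data and supporting the development and evaluation of smart city security tools.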
Details
- Comments: 10 pages, 1 figure, 2 tables
Smart cities rely on interconnected cyber-physical systems that integrate sensors, IoT devices, cloud platforms, and AI-driven services and decision-making. While these systems enhance city services, they also introduce complex cybersecurity challenges due to their large attack surfaces, heterogeneous data flows, and evolving threat vectors. Developing and validating cybersecurity tools for smart cities requires high-quality datasets that accurately represent real operational conditions. However, real-world datasets are often incomplete, contain privacy-sensitive data, are difficult to access, or lack sufficient malicious activity to support tool development. This research addresses this critical gap by proposing an AI-based synthetic data generation (SDG) framework designed specifically for smart city cybersecurity research. The proposed framework leverages generative artificial intelligence models to produce high-fidelity synthetic cybersecurity datasets that replicate realistic device behaviors, network interactions, and cyber-attack scenarios. The synthetic datasets are evaluated for conformity to protocol standards, statistical similarity to original datasets, and utility in common security tools. The resulting synthetic data generation framework and evaluation metrics are expected to advance smart city cybersecurity by enabling researchers to model threats more effectively and evaluate defensive techniques more comprehensively to better protect critical smart city infrastructures.