2605.29801
2026-05-29
cs.AI
cs.CL
cs.CR
cs.CV
cs.LG
版本更新
AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security
AgentDoG 1.5:一种轻量级且可扩展的AI智能体安全与安保对齐框架
Dongrui Liu, Yu Li, Zhonghao Yang, Peng Wang, Guanxu Chen, Yuejin Xie, Qinghua Mao, Wanying Qu, Yanxu Zhu, Tianyi Zhou, Leitao Yuan, Zhijie Zheng, Qihao Lin, Yimin Wang, Haoyu Luo, Shuai Shao, Chen Qian, Qingyu Liu, Ling Tang, Ruiyang Qin, Qihan Ren, Junxiao Yang, Kun Wang, Zhiheng Xi, Linfeng Zhang, Ranjie Duan, Bo Zhang, Wenjie Wang, Wen Shen, Qiaosheng Zhang, Yan Teng, Chaochao Lu, Rui Mei, Man Li, Jialing Tao, Xi Lin, Tianhang Zheng, Yong Liu, Quanshi Zhang, Lei Zhu, Xingjun Ma, Junhua Liu, Hui Xue, Xiaoxiang Zuo, Xiangnan He, Chao Shen, Xianglong Liu, Minlie Huang, Jing Shao, Xia Hu
发表机构
*
Shanghai Artificial Intelligence Laboratory(上海人工智能实验室)
AI总结
针对开放世界智能体的新兴安全风险,提出一种轻量级可扩展的安全对齐框架,通过更新安全分类法、构建数据引擎并训练小模型(0.8B-8B参数),实现与闭源模型相当的性能,并降低部署开销两个数量级。