2605.14355
2026-06-02
cs.AI
cs.CL
Version update
Herculean: An Agentic Benchmark for Financial Intelligence
Xueqing Peng, Zhuohan Xie, Yupeng Cao, Haohang Li, Lingfei Qian, Yan Wang, Vincent Jim Zhang, Huan He, Xuguang Ai, Linhai Ma, Ruoyu Xiang, Yueru He, Yi Han, Shuyao Wang, Yuqing Guo, Mingyang Jiang, Yilun Zhao, Youzhong Dong, Xiaoyu Wang, Yankai Chen, Ye Yuan, Qiyuan Zhang, Fuyuan Lyu, Haolun Wu, Yonghan Yang, Zichen Zhao, Yuyang Dai, Fan Zhang, Rania Elbadry, Ayesha Gull, Muhammad Usman Safder, Nuo Chen, Fengbin Zhu, Tianshi Cai, Zimu Wang, Polydoros Giannouris, Yuechen Jiang, Zhiwei Liu, Mohsinul Kabir, Yuyan Wang, Yixiang Zheng, Yangyang Yu, Weijin Liu, Wenbo Cao, Anke Xu, Peng Lu, Jerry Huang, Mingquan Lin, Prayag Tiwari, Yijia Zhao, Víctor Gutiérrez-Basulto, Xiao-Yang Liu, Kaleb E Smith, Jiahuan Pei, Arman Cohan, Jimin Huang, Yuehua Tang, Alejandro Lopez-Lira, Xi Chen, Xue Liu, Junichi Tsujii, Jian-Yun Nie, Sophia Ananiadou
Affiliations
The Fin AI
Yale University
Columbia University
Stevens Institute of Technology
NVIDIA
New York University
Georgia Institute of Technology
University of Florida
MBZUAI
Université de Montréal
University of Minnesota
University of Massachusetts Boston
National Institute of Advanced Industrial Science and Technology
University of Liverpool
Vrije Universiteit Amsterdam
National University of Singapore
Halmstad University
University of Manchester
Cardiff University
McGill University
Mila – Quebec AI Institute
AI summary
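This paper presents Herculean, the first agentic benchmark for financial intelligence to cover four representative workflows: trading, hedging, market insights, and auditing. It evaluates heterogeneous agent systems in a standardized MCP skill environment. Agents perform relatively well on trading and market insights but fall notably short on hedging and auditing, tasks that require long-horizon coordination, state consistency, and structured verification.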