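AI Large Models

Large-Model Reasoning Capabilities

Covers large models' capabilities in mathematics, logic, planning, multi-step reasoning, and test-time compute.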

Collected today (current date): 3 papers. Signal sources: cs.CL, cs.AI, cs.LG

1. Mathematical Reasoning (1 paper)

2605.20531 2026-06-19 cs.LO cs.LG Version update Topic 80

Pseudo-Formalization for Automatic Proof Verification

Slim Barkallah, Luke Bailey, Kaiyue Wen, Mohammed Abouzaid, Tengyu Ma

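Topic match (Mathematical Reasoning): pseudo-formalization for automatic proof verification

AI summary: This paper proposes a proof format called pseudo-formalization, which keeps the flexibility of natural language while preserving the modularity and precision of formal proofs. A block verification algorithm enables efficient verification of natural-language proofs, and the approach outperforms existing baselines in both precision and recall of error detection.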

Comments 31 pages, code available at https://github.com/Slim205/pseudo-formalization

2. Complex Problem Solving (2 papers)

2305.14985 2026-06-19 cs.CV cs.CL Version update Topic 65

IdealGPT: Iteratively Decomposing Vision and Language Reasoning via Large Language Models

Haoxuan You, Rui Sun, Zhecan Wang, Long Chen, Gengyu Wang, Hammad A. Ayyubi, Kai-Wei Chang, Shih-Fu Chang

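Topic match (Complex Problem Solving): an LLM generates sub-questions and reasons out the final answer.

AI summary: Proposes the IdealGPT framework, which uses large language models to iteratively decompose vision-and-language reasoning tasks. Through a loop of sub-question generation, sub-answer acquisition, and final-answer reasoning, it significantly improves multi-step reasoning performance in the zero-shot setting.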

Comments 13 pages, 5 figures
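
The loop described in the summary above (generate sub-questions, obtain sub-answers, reason toward a final answer) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function `iterative_decompose`, its three callable parameters, the `max_rounds` cap, and the convention that the reasoner returns `None` when it cannot yet answer are all assumptions made here, and the toy stubs stand in for real language and vision-language models.

```python
from typing import Callable, List, Optional, Tuple

QAPairs = List[Tuple[str, str]]  # accumulated (sub-question, sub-answer) pairs


def iterative_decompose(
    question: str,
    ask_subquestions: Callable[[str, QAPairs], List[str]],  # hypothetical questioner
    answer_subquestion: Callable[[str], str],               # hypothetical answerer
    reason: Callable[[str, QAPairs], Optional[str]],        # hypothetical reasoner
    max_rounds: int = 3,                                    # assumed iteration cap
) -> Tuple[Optional[str], QAPairs]:
    """Repeat sub-question generation, answering, and reasoning until an answer appears."""
    evidence: QAPairs = []
    for _ in range(max_rounds):
        # Step 1: generate new sub-questions given the evidence gathered so far.
        for sub_q in ask_subquestions(question, evidence):
            # Step 2: obtain a sub-answer for each sub-question.
            evidence.append((sub_q, answer_subquestion(sub_q)))
        # Step 3: try to infer the final answer; None means "not enough evidence yet".
        answer = reason(question, evidence)
        if answer is not None:
            return answer, evidence
    return None, evidence


# Toy stand-ins so the sketch runs without any model.
def toy_ask(question: str, evidence: QAPairs) -> List[str]:
    return [f"sub-question {len(evidence) + 1}"]


def toy_answer(sub_q: str) -> str:
    return f"answer to {sub_q}"


def toy_reason(question: str, evidence: QAPairs) -> Optional[str]:
    # Pretend two pieces of evidence are enough to decide.
    return "final answer" if len(evidence) >= 2 else None


answer, evidence = iterative_decompose(
    "What is happening in the image?", toy_ask, toy_answer, toy_reason
)
```

With these stubs the loop runs two rounds before the reasoner commits to an answer. In a real setting the three callables would wrap model calls, and the stopping rule would depend on how the reasoner signals confidence.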

1702.06162 2026-06-19 cs.CR Version update Topic 55

Survey of Automated Vulnerability Detection and Exploit Generation Techniques in Cyber Reasoning Systems

Teresa Nicole Brooks

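Topic match (Complex Problem Solving): surveys automated vulnerability detection and exploit generation, which involve reasoning

AI summary: This paper surveys the automated vulnerability detection and exploit generation techniques of Mayhem and Mechanical Phish, winning systems in the DARPA Cyber Grand Challenge, and summarizes their core methods, underlying technologies, and related work.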

Comments This is the accepted submitted version of this paper that was published in the Intelligent Computing Proceedings of the 2018 Computing Conference, Volume 2

Journal ref Intelligent Computing: Proceedings of the 2018 Computing Conference, Vol. 2, Springer, 2019, pp. 1083-1102
