2605.24828
2026-06-02
cs.AI
Test-Time Deep Thinking to Explore Implicit Rules
Wentong Chen, Xin Cong, Zhong Zhang, Yaxi Lu, Siyuan Zhao, Yesai Wu, Qinyu Luo, Haotian Chen, Yankai Lin, Zhiyuan Liu, Maosong Sun
Affiliations
Renmin University of China
Department of Statistics and Data Science, Tsinghua University
School of Computer Science and Engineering, UESTC
Department of Computer Science and Technology, Tsinghua University
School of Mathematical Sciences, Nankai University
Whiting School, Johns Hopkins University
School of Artificial Intelligence, Shanghai Jiao Tong University
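AI Summary
To address the failure of agents in environments governed by implicit rules, the paper proposes the TTExplore framework, which trains a dedicated model, Exp-Thinker, to perform test-time reasoning, improving baseline performance by 14 to 19 points on average.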