SMC-AI: Scaling Monte Carlo Simulation to Four Trillion Atoms with AI Accelerators
Xianglin Liu, Kai Yang, Fanli Zhou, Yongxiang Liu, Hao Chen, Yijia Zhang, Dengdong Fan, Wenbo Li, Bingqiang Wang, Shixun Zhang, Pengxiang Xu, Yonghong Tian
The rapid advancement of deep learning is steering hardware design toward AI tasks, posing fundamental challenges for HPC workloads such as atomistic simulation. Here we present SMC-AI, a general algorithmic framework that extends the SMC-X method to enable efficient canonical Monte Carlo (MC) simulation on AI accelerators, including GPUs and NPUs, while maintaining extreme scalability. The implementation of SMC-AI on an NPU cluster reaches unprecedented performance, achieving MC simulation of 4 trillion atoms on 4096 NPU dies. This represents the largest ML-accelerated atomistic simulation reported, delivering 32 times the system size and 1.3 times the throughput of previous records with a relatively small computational budget. Excellent strong and weak scaling efficiencies are reached for both the NPU and GPU implementations. By decoupling ML models from the simulation, SMC-AI creates an abstraction that facilitates the integration and porting of diverse ML models, laying a foundation for the future development of scalable scientific software.