TY - JOUR
T1 - Performance modeling and running strategy of parallel cdugksFOAM program
AU - Wang, Yunlan
AU - Liu, Yufeng
AU - Zhang, Rui
AU - Zhao, Tianhai
AU - Liu, Sha
AU - Zhuo, Congshan
AU - Zhong, Chengwen
N1 - Publisher Copyright: © 2024 Elsevier B.V.
PY - 2024/7
Y1 - 2024/7
N2 - The cdugksFOAM program parallelizes the physical-space grid and the velocity-space grid simultaneously. Its distinguishing feature is its potential for large-scale parallelism. However, the running time of the cdugksFOAM program depends significantly on the number of physical-space and velocity-space partitions. To find the optimal partitioning strategy for a specific CFD problem running on a parallel computer, we performed performance modeling of the cdugksFOAM program. First, we proposed a floating-point operations model, an MPI communication volume model, and a memory consumption model. Based on these models, we established a Roofline model to predict the computational time and a separate model to predict the communication time. From the computational time model and the communication time model, we derived an execution time model and verified its effectiveness with two cases. Finally, we identified the optimal running strategy, which minimizes the product of the number of computing nodes and the execution time, providing meaningful guidance for the economical execution of the program.
AB - The cdugksFOAM program parallelizes the physical-space grid and the velocity-space grid simultaneously. Its distinguishing feature is its potential for large-scale parallelism. However, the running time of the cdugksFOAM program depends significantly on the number of physical-space and velocity-space partitions. To find the optimal partitioning strategy for a specific CFD problem running on a parallel computer, we performed performance modeling of the cdugksFOAM program. First, we proposed a floating-point operations model, an MPI communication volume model, and a memory consumption model. Based on these models, we established a Roofline model to predict the computational time and a separate model to predict the communication time. From the computational time model and the communication time model, we derived an execution time model and verified its effectiveness with two cases. Finally, we identified the optimal running strategy, which minimizes the product of the number of computing nodes and the execution time, providing meaningful guidance for the economical execution of the program.
KW - DUGKS
KW - MPI
KW - Performance modeling
KW - Roofline model
UR - http://www.scopus.com/inward/record.url?scp=85189105739&partnerID=8YFLogxK
U2 - 10.1016/j.cpc.2024.109186
DO - 10.1016/j.cpc.2024.109186
M3 - Article
AN - SCOPUS:85189105739
SN - 0010-4655
VL - 300
JO - Computer Physics Communications
JF - Computer Physics Communications
M1 - 109186
ER -