A parallel lattice Boltzmann method for large eddy simulation on multiple GPUs

Qinjian Li; Chengwen Zhong; Kai Li; Guangyong Zhang; Xiaowei Lu; Qing Zhang; Kaiyong Zhao; Xiaowen Chu

doi:10.1007/s00607-013-0356-7

School of Aeronautics

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

To improve the simulation efficiency of turbulent fluid flows at high Reynolds numbers with large eddy dynamics, a CUDA-based lattice Boltzmann method solution for large eddy simulation (LES) using multiple graphics processing units (GPUs) is proposed. Our solution adopts the "collision after propagation" lattice evolution scheme and places the misaligned propagation phase in the global memory read step. The latest GPU platform allows a single CPU thread to control up to four GPUs that run in parallel. To make use of multiple GPUs, the whole working set is evenly partitioned into sub-domains. We implement both the Smagorinsky model and the Vreman model to verify our multi-GPU solution. These two LES models calculate the relaxation time differently, which leads to different CUDA implementation characteristics. The implementation based on the Smagorinsky model achieves a 190-fold speedup over the sequential CPU implementation, while the implementation based on the Vreman model achieves a speedup of more than 90 times. The experimental results show that the parallel performance of our solution scales very well across multiple GPUs. Therefore, large-scale (up to 10,240 × 10,240 lattices) LES-LBM simulation becomes possible at low cost, even with double-precision floating-point calculation.
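The even partitioning of the working set into sub-domains can be illustrated with a minimal sketch. The abstract does not say how the lattice is cut, so this assumes a one-dimensional decomposition into contiguous row slabs, one per GPU; the function name `partition_rows` and its parameters are hypothetical and not taken from the paper.

```python
def partition_rows(num_rows, num_gpus):
    """Split num_rows lattice rows into num_gpus contiguous slabs.

    Assumption (not from the paper): a 1-D slab decomposition along rows.
    Returns a list of (start, stop) half-open row ranges, one per GPU.
    Slab heights differ by at most one row when num_rows is not divisible
    by num_gpus.
    """
    if num_gpus < 1 or num_rows < num_gpus:
        raise ValueError("need at least one row per GPU")
    base, extra = divmod(num_rows, num_gpus)
    slabs = []
    start = 0
    for gpu in range(num_gpus):
        # The first `extra` slabs take one additional row each.
        height = base + (1 if gpu < extra else 0)
        slabs.append((start, start + height))
        start += height
    return slabs


# Largest lattice mentioned in the abstract, split across four GPUs.
slabs = partition_rows(10240, 4)
for gpu, (start, stop) in enumerate(slabs):
    print(f"GPU {gpu}: rows [{start}, {stop})")
```

In a real multi-GPU run each slab would also need halo rows exchanged with its neighbours every time step; that exchange is omitted here.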

Original language	English
Pages (from-to)	479-501
Number of pages	23
Journal	Computing
Volume	96
Issue number	6
DOIs	https://doi.org/10.1007/s00607-013-0356-7
State	Published - Jun 2014

Keywords

CUDA
GPU Computing
Large eddy simulation
Lattice Boltzmann method

Access to Document

10.1007/s00607-013-0356-7

Cite this

@article{748cfd148ed3443cab1b9eb266baa8a1,

title = "A parallel lattice Boltzmann method for large eddy simulation on multiple GPUs",

abstract = "To improve the simulation efficiency of turbulent fluid flows at high Reynolds numbers with large eddy dynamics, a CUDA-based lattice Boltzmann method solution for large eddy simulation (LES) using multiple graphics processing units (GPUs) is proposed. Our solution adopts the {"}collision after propagation{"} lattice evolution scheme and places the misaligned propagation phase in the global memory read step. The latest GPU platform allows a single CPU thread to control up to four GPUs that run in parallel. To make use of multiple GPUs, the whole working set is evenly partitioned into sub-domains. We implement both the Smagorinsky model and the Vreman model to verify our multi-GPU solution. These two LES models calculate the relaxation time differently, which leads to different CUDA implementation characteristics. The implementation based on the Smagorinsky model achieves a 190-fold speedup over the sequential CPU implementation, while the implementation based on the Vreman model achieves a speedup of more than 90 times. The experimental results show that the parallel performance of our solution scales very well across multiple GPUs. Therefore, large-scale (up to 10,240 × 10,240 lattices) LES-LBM simulation becomes possible at low cost, even with double-precision floating-point calculation.",

keywords = "CUDA, GPU Computing, Large eddy simulation, Lattice Boltzmann method",

author = "Qinjian Li and Chengwen Zhong and Kai Li and Guangyong Zhang and Xiaowei Lu and Qing Zhang and Kaiyong Zhao and Xiaowen Chu",

year = "2014",

month = jun,

doi = "10.1007/s00607-013-0356-7",

language = "English",

volume = "96",

pages = "479--501",

journal = "Computing",

issn = "0010-485X",

publisher = "Springer",

number = "6",

}

TY - JOUR

T1 - A parallel lattice Boltzmann method for large eddy simulation on multiple GPUs

AU - Li, Qinjian

AU - Zhong, Chengwen

AU - Li, Kai

AU - Zhang, Guangyong

AU - Lu, Xiaowei

AU - Zhang, Qing

AU - Zhao, Kaiyong

AU - Chu, Xiaowen

PY - 2014/6

Y1 - 2014/6

N2 - To improve the simulation efficiency of turbulent fluid flows at high Reynolds numbers with large eddy dynamics, a CUDA-based lattice Boltzmann method solution for large eddy simulation (LES) using multiple graphics processing units (GPUs) is proposed. Our solution adopts the "collision after propagation" lattice evolution scheme and places the misaligned propagation phase in the global memory read step. The latest GPU platform allows a single CPU thread to control up to four GPUs that run in parallel. To make use of multiple GPUs, the whole working set is evenly partitioned into sub-domains. We implement both the Smagorinsky model and the Vreman model to verify our multi-GPU solution. These two LES models calculate the relaxation time differently, which leads to different CUDA implementation characteristics. The implementation based on the Smagorinsky model achieves a 190-fold speedup over the sequential CPU implementation, while the implementation based on the Vreman model achieves a speedup of more than 90 times. The experimental results show that the parallel performance of our solution scales very well across multiple GPUs. Therefore, large-scale (up to 10,240 × 10,240 lattices) LES-LBM simulation becomes possible at low cost, even with double-precision floating-point calculation.

AB - To improve the simulation efficiency of turbulent fluid flows at high Reynolds numbers with large eddy dynamics, a CUDA-based lattice Boltzmann method solution for large eddy simulation (LES) using multiple graphics processing units (GPUs) is proposed. Our solution adopts the "collision after propagation" lattice evolution scheme and places the misaligned propagation phase in the global memory read step. The latest GPU platform allows a single CPU thread to control up to four GPUs that run in parallel. To make use of multiple GPUs, the whole working set is evenly partitioned into sub-domains. We implement both the Smagorinsky model and the Vreman model to verify our multi-GPU solution. These two LES models calculate the relaxation time differently, which leads to different CUDA implementation characteristics. The implementation based on the Smagorinsky model achieves a 190-fold speedup over the sequential CPU implementation, while the implementation based on the Vreman model achieves a speedup of more than 90 times. The experimental results show that the parallel performance of our solution scales very well across multiple GPUs. Therefore, large-scale (up to 10,240 × 10,240 lattices) LES-LBM simulation becomes possible at low cost, even with double-precision floating-point calculation.

KW - CUDA

KW - GPU Computing

KW - Large eddy simulation

KW - Lattice Boltzmann method

UR - http://www.scopus.com/inward/record.url?scp=84901617575&partnerID=8YFLogxK

U2 - 10.1007/s00607-013-0356-7

DO - 10.1007/s00607-013-0356-7

M3 - Article

AN - SCOPUS:84901617575

SN - 0010-485X

VL - 96

SP - 479

EP - 501

JO - Computing

JF - Computing

IS - 6

ER -
