CUDA-Accelerated Optimization Based on the Lattice Boltzmann Method

张乾毅; 韦华健; 赫轶男; 李华兵

Abstract

Abstract: To improve the efficiency of fluid simulations while preserving the accuracy of the results, the CUDA programming platform and the floating-point throughput of the GPU are used to accelerate a Poiseuille flow simulation based on the lattice Boltzmann method. Two addressing schemes, linear addressing and subscript addressing, are designed and applied to the collision, streaming, and macroscopic-quantity calculation steps of the lattice Boltzmann program, and their effect on computational efficiency is examined. The program also uses unified memory management: variables allocated this way are accessible from both the host and the device, which simplifies the code and reduces the overhead of repeatedly allocating memory for variables. On an Intel® Xeon® E5-2620 v4 CPU and an Nvidia Quadro GP100 GPU, the linear addressing and subscript addressing methods achieve speedups of 71× and 25×, respectively, over the serial CPU code.
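The abstract does not define the two addressing schemes in detail, so the following is only a minimal CUDA sketch of one plausible reading: "linear" addressing uses a single flat thread index per lattice node, while "subscript" addressing recovers explicit (x, y) subscripts from a 2D thread grid. The lattice size, the D2Q9 layout, the kernel names, and the helpers `node_index`, `dist_index`, and `max_density_mismatch` are all assumptions made for illustration and are not taken from the paper. Buffers are allocated with `cudaMallocManaged`, so the same pointers are written on the host and read in the kernels.

```cuda
#include <cmath>
#include <cuda_runtime.h>

// Hypothetical D2Q9 lattice size; not taken from the paper.
constexpr int NX = 64, NY = 32, Q = 9;
constexpr int N = NX * NY;

// Assumed memory layout: direction-major, then row-major over nodes.
__host__ __device__ inline int node_index(int x, int y) { return y * NX + x; }
__host__ __device__ inline int dist_index(int k, int n) { return k * N + n; }

// "Linear" addressing (assumed meaning): one flat thread index per node.
__global__ void density_linear(const float* f, float* rho) {
    int n = blockIdx.x * blockDim.x + threadIdx.x;
    if (n >= N) return;
    float sum = 0.0f;
    for (int k = 0; k < Q; ++k) sum += f[dist_index(k, n)];
    rho[n] = sum;  // macroscopic density = sum of the distributions
}

// "Subscript" addressing (assumed meaning): explicit (x, y) from a 2D grid.
__global__ void density_subscript(const float* f, float* rho) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= NX || y >= NY) return;
    int n = node_index(x, y);
    float sum = 0.0f;
    for (int k = 0; k < Q; ++k) sum += f[dist_index(k, n)];
    rho[n] = sum;
}

// Runs both kernels on unified memory and returns the largest difference
// between their results, or -1 if any CUDA call fails.
float max_density_mismatch() {
    float *f = nullptr, *rho_a = nullptr, *rho_b = nullptr;
    bool ok = cudaMallocManaged(&f, Q * N * sizeof(float)) == cudaSuccess
           && cudaMallocManaged(&rho_a, N * sizeof(float)) == cudaSuccess
           && cudaMallocManaged(&rho_b, N * sizeof(float)) == cudaSuccess;
    float worst = -1.0f;
    if (ok) {
        // The host writes the managed buffer directly; no explicit cudaMemcpy.
        for (int i = 0; i < Q * N; ++i) f[i] = 0.01f * (i % 7);

        density_linear<<<(N + 255) / 256, 256>>>(f, rho_a);
        dim3 block(16, 16), grid((NX + 15) / 16, (NY + 15) / 16);
        density_subscript<<<grid, block>>>(f, rho_b);

        // Synchronize before the host reads managed memory again.
        if (cudaDeviceSynchronize() == cudaSuccess) {
            worst = 0.0f;
            for (int n = 0; n < N; ++n)
                worst = std::fmax(worst, std::fabs(rho_a[n] - rho_b[n]));
        }
    }
    cudaFree(f); cudaFree(rho_a); cudaFree(rho_b);  // cudaFree(nullptr) is a no-op
    return worst;
}
```

Calling `max_density_mismatch()` from a `main` function and printing the result gives a quick check that the two indexing schemes compute the same density field. The sketch says nothing about their relative speed, which is the subject of the paper's measurements.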
