TY - JOUR
T1 - 一种基于 LLVM Pass 的复杂嵌套循环自动并行化框架
AU - Ma, Chun Yan
AU - Lv, Bing Xu
AU - Ye, Xu Jiao
AU - Zhang, Yu
N1 - Publisher Copyright:
© 2023 Chinese Academy of Sciences. All rights reserved.
PY - 2023
Y1 - 2023
N2 - With the popularization of multi-core processors, automatic parallelization of serial codes in embedded legacy systems is a research hotspot. Among them, there are technical challenges in the automatic parallelization method for complex nested loops with imperfect nested structure and non-affine dependency characteristics. This paper proposes an automatic parallelization framework (CNLPF) for complex nested loops based on LLVM Pass. Firstly, a representation model of complex nested loops, namely loop structure tree, is proposed, and the regular region of nested loops is automatically converted into a loop structure tree representation. Then, the data dependency analysis is carried out on the loop structure tree to construct intra-loop and inter-loop dependency relationship. Finally, the parallel loop program is generated based on the OpenMP shared memory programming model. For the 6 program cases in the SPEC2006 data set containing nearly 500 complex nested loops, the statistics of the proportion of complex nested loops and the parallel performance acceleration test were carried out respectively. The results show that the automatic parallelization framework proposed in this paper can deal with complex nested loops that cannot be optimized by LLVM Polly, which enhances the parallel compilation and optimization capabilities of LLVM, and the method combined with Polly optimization improves the acceleration effect of Polly optimization alone by 9%~43%.
AB - With the popularization of multi-core processors, automatic parallelization of serial codes in embedded legacy systems is a research hotspot. Among them, there are technical challenges in the automatic parallelization method for complex nested loops with imperfect nested structure and non-affine dependency characteristics. This paper proposes an automatic parallelization framework (CNLPF) for complex nested loops based on LLVM Pass. Firstly, a representation model of complex nested loops, namely loop structure tree, is proposed, and the regular region of nested loops is automatically converted into a loop structure tree representation. Then, the data dependency analysis is carried out on the loop structure tree to construct intra-loop and inter-loop dependency relationship. Finally, the parallel loop program is generated based on the OpenMP shared memory programming model. For the 6 program cases in the SPEC2006 data set containing nearly 500 complex nested loops, the statistics of the proportion of complex nested loops and the parallel performance acceleration test were carried out respectively. The results show that the automatic parallelization framework proposed in this paper can deal with complex nested loops that cannot be optimized by LLVM Polly, which enhances the parallel compilation and optimization capabilities of LLVM, and the method combined with Polly optimization improves the acceleration effect of Polly optimization alone by 9%~43%.
KW - automatic parallelization
KW - complex nested loops
KW - dependency analysis
KW - LLVM Pass
UR - http://www.scopus.com/inward/record.url?scp=85161449416&partnerID=8YFLogxK
U2 - 10.13328/j.cnki.jos.006858
DO - 10.13328/j.cnki.jos.006858
M3 - 文章
AN - SCOPUS:85161449416
SN - 1000-9825
VL - 34
JO - Ruan Jian Xue Bao/Journal of Software
JF - Ruan Jian Xue Bao/Journal of Software
IS - 7
ER -