TY - JOUR
T1 - Realistic acceleration of neural networks with fine-grained tensor decomposition
AU - Lv, Rui
AU - Wang, Dingheng
AU - Zheng, Jiangbin
AU - Xie, Yefan
AU - Yang, Zhao Xu
N1 - Publisher Copyright:
© 2022 Elsevier B.V.
PY - 2022/11/1
Y1 - 2022/11/1
N2 - As modern deep neural networks (DNNs) have become increasingly large-scale and expensive, DNN compression has grown into a popular research direction. Among the various compression methods, tensor decomposition appears to be the most promising and low-cost one because of its solid mathematical foundations and regular data structure. However, most existing tensor decompositions are not well suited to accelerating DNNs, because transpositions on tensor modes are always necessary so that the input data can be contracted correctly with the decomposed factor tensors, and transposition undoubtedly brings extra memory and time cost in a realistic system. In this paper, we select the relatively novel Kronecker CANDECOMP/PARAFAC (KCP) tensor decomposition, which has fine-grained factor tensors, and propose a transposition-free algorithm to compute the contractions between the input data and the neural weights in KCP format. A theoretical analysis of computation complexity indicates that the proposed method is much more efficient than existing algorithms. We further prove that training a KCP-DNN based on the proposed transposition-free algorithm can also be faster than with traditional algorithms, and we make a comprehensive comparison of space and computation complexity covering both training and inference stages to show the superiority of our method. As a series of related works pays more attention to recurrent neural networks (RNNs), we follow these existing practices and focus on the KCP-RNN for a comprehensive comparison with them; the experimental results show that our KCP-RNN with the transposition-free algorithm has systematic advantages in accuracy, space complexity, computation complexity, and realistic running time. In addition, some advanced characteristics of the KCP-DNN, such as the collocation of ranks, are also discussed.
KW - Complexity comparison
KW - Kronecker CANDECOMP/PARAFAC
KW - Neural network compression
KW - Tensor decomposition
KW - Transposition-free algorithm
UR - http://www.scopus.com/inward/record.url?scp=85138455942&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2022.09.057
DO - 10.1016/j.neucom.2022.09.057
M3 - Article
AN - SCOPUS:85138455942
SN - 0925-2312
VL - 512
SP - 52
EP - 68
JO - Neurocomputing
JF - Neurocomputing
ER -