TY - JOUR
T1 - Realistic acceleration of neural networks with fine-grained tensor decomposition
AU - Lv, Rui
AU - Wang, Dingheng
AU - Zheng, Jiangbin
AU - Xie, Yefan
AU - Yang, Zhao Xu
N1 - Publisher Copyright:
© 2022 Elsevier B.V.
PY - 2022/11/1
Y1 - 2022/11/1
N2 - As modern deep neural networks (DNNs) have become increasingly large-scale and expensive, DNN compression has grown into a popular research direction. Among the various compression methods, tensor decomposition appears to be the most promising and low-cost one because of its solid mathematical foundations and regular data structure. However, most existing tensor decompositions are not well suited to accelerating DNNs, because transpositions on tensor modes are always necessary so that the input data can be contracted correctly with the decomposed factor tensors, and transposition undoubtedly brings extra memory and time cost in a realistic system. In this paper, we select the relatively novel Kronecker CANDECOMP/PARAFAC (KCP) tensor decomposition, which has fine-grained factor tensors, and propose a transposition-free algorithm to compute the contractions between the input data and the neural weights in KCP format. A theoretical analysis of computation complexity indicates that the proposed method is much more efficient than existing algorithms. We further prove that training a KCP-DNN based on the proposed transposition-free algorithm can also be faster than with traditional algorithms, and we make a comprehensive comparison of space and computation complexity covering both training and inference stages to show the superiority of our method. As a series of related works pays more attention to recurrent neural networks (RNNs), we follow these existing practices and focus on the KCP-RNN for a comprehensive comparison with them; the experimental results show that our KCP-RNN with the transposition-free algorithm has systematic advantages in accuracy, space complexity, computation complexity, and realistic running time. In addition, some advanced characteristics of the KCP-DNN, such as the collocation of ranks, are also discussed.
KW - Complexity comparison
KW - Kronecker CANDECOMP/PARAFAC
KW - Neural network compression
KW - Tensor decomposition
KW - Transposition-free algorithm
UR - http://www.scopus.com/inward/record.url?scp=85138455942&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2022.09.057
DO - 10.1016/j.neucom.2022.09.057
M3 - Article
AN - SCOPUS:85138455942
SN - 0925-2312
VL - 512
SP - 52
EP - 68
JO - Neurocomputing
JF - Neurocomputing
ER -