神经网络轻量化综述

Yuchen Duan, Zhenyu Fang, Jiangbin Zheng

Research output: Journal article › Literature review › Peer-reviewed

Abstract

With the continuous progress of deep learning, artificial neural network models have achieved unprecedented performance in fields such as image recognition, natural language processing, and autonomous driving. These models often have millions or even billions of parameters and learn complex feature representations from large amounts of training data. However, in resource-constrained environments such as mobile devices, embedded systems, and other edge-computing scenarios, power consumption, memory usage, and computational efficiency limit the deployment of large-scale neural network models. To address this problem, researchers have proposed a variety of model compression techniques, such as pruning, knowledge distillation, neural architecture search (NAS), quantization, and low-rank decomposition, which aim to reduce the number of parameters, the computational complexity, and the storage requirements of a model while preserving its accuracy as much as possible. This survey systematically reviews the development of these model compression methods, focusing on the main principles and key techniques of each: the different strategies of pruning, such as structured and unstructured pruning; how knowledge is defined in knowledge distillation; the search space, search algorithm, and network performance evaluation in NAS; post-training quantization and in-training quantization; and singular value decomposition and tensor decomposition in low-rank decomposition. Finally, future directions for model compression technology are discussed.
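Of the techniques surveyed, low-rank decomposition is the easiest to illustrate compactly. The sketch below, a minimal illustration rather than any method from the paper, compresses a dense layer's weight matrix with truncated singular value decomposition, replacing one large matrix-vector product with two smaller ones; the matrix sizes and rank `r` are arbitrary assumptions for the example.

```python
import numpy as np

# Toy fully connected layer: W maps a 512-dim input to a 256-dim output.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 512))
r = 32  # target rank: a tuning knob trading accuracy for parameter count

# Truncated SVD: keep only the r largest singular values/vectors.
U, S, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :r] * S[:r]   # shape (256, r)
B = Vt[:r, :]          # shape (r, 512)

# Inference with the factored layer: two small matmuls instead of one large one.
x = rng.standard_normal(512)
y_full = W @ x
y_low = A @ (B @ x)    # same output shape, approximate values

orig_params = W.size              # 256 * 512 = 131072
compressed_params = A.size + B.size  # 256*32 + 32*512 = 24576
print(orig_params, compressed_params)
```

The same factorization carries over to convolutional layers by reshaping the kernel tensor, which is where the tensor decompositions mentioned in the abstract (e.g. CP or Tucker) generalize the matrix case.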

Translated title of the contribution: Review of Neural Network Lightweight
Original language: Traditional Chinese
Pages (from-to): 835-853
Number of pages: 19
Journal: Journal of Frontiers of Computer Science and Technology
Volume: 19
Issue number: 4
DOI
Publication status: Published - 1 Apr 2025

Keywords

  • knowledge distillation
  • low-rank decomposition
  • neural architecture search (NAS)
  • pruning
  • quantization
