TY - JOUR
T1 - Efficient acceleration technology for on-orbit object detection
AU - Huyan, Lang
AU - Li, Ying
AU - Jiang, Dongmei
AU - Zhang, Yanning
AU - Zhou, Quan
AU - Wei, Jiayuan
AU - Liu, Juanni
N1 - Publisher Copyright: © 2022 China Spaceflight Society. All rights reserved.
PY - 2022/11
Y1 - 2022/11
N2 - Deep convolutional neural network object detection algorithms are difficult to deploy onboard because of their large parameter counts and heavy computation, combined with limited onboard computing resources, storage, and power. To address this, an efficient onboard object detection acceleration framework and its implementation method are proposed. First, a computing engine compatible with three convolutional operators is designed, which improves resource utilization. Second, the object detection model is expanded along two dimensions, channel and convolution kernel, giving the accelerator high parallelism and scalability. Finally, the accelerator is implemented on multiple FPGA platforms and its performance is evaluated. Experimental results show that the proposed FPGA-based accelerator achieves a throughput of up to 1843.2 GFLOPS with an inference time of 0.22 ms. Compared with accelerators reported in the related literature, it shows advantages in performance, power consumption, energy efficiency, and inference time. It is suited to deployment in resource-constrained environments and has good application prospects on satellites.
AB - Deep convolutional neural network object detection algorithms are difficult to deploy onboard because of their large parameter counts and heavy computation, combined with limited onboard computing resources, storage, and power. To address this, an efficient onboard object detection acceleration framework and its implementation method are proposed. First, a computing engine compatible with three convolutional operators is designed, which improves resource utilization. Second, the object detection model is expanded along two dimensions, channel and convolution kernel, giving the accelerator high parallelism and scalability. Finally, the accelerator is implemented on multiple FPGA platforms and its performance is evaluated. Experimental results show that the proposed FPGA-based accelerator achieves a throughput of up to 1843.2 GFLOPS with an inference time of 0.22 ms. Compared with accelerators reported in the related literature, it shows advantages in performance, power consumption, energy efficiency, and inference time. It is suited to deployment in resource-constrained environments and has good application prospects on satellites.
KW - Computational intensity
KW - Convolutional neural networks
KW - Model acceleration
KW - Model quantization
KW - Object detection
UR - http://www.scopus.com/inward/record.url?scp=85146762487&partnerID=8YFLogxK
U2 - 10.3873/j.issn.1000-1328.2022.11.011
DO - 10.3873/j.issn.1000-1328.2022.11.011
M3 - Article
AN - SCOPUS:85146762487
SN - 1000-1328
VL - 43
SP - 1544
EP - 1556
JO - Yuhang Xuebao/Journal of Astronautics
JF - Yuhang Xuebao/Journal of Astronautics
IS - 11
ER -