HFOD: A hardware-friendly quantization method for object detection on embedded FPGAs

Fei Zhang; Ziyang Gao; Jiaming Huang; Peining Zhen; Hai Bao Chen; Jie Yan

doi:10.1587/elex.19.20220067

School of Astronautics

Research output: Contribution to journal › Article › peer-review

5 citations (Scopus)

Abstract

There are two research hotspots for improving the performance and energy efficiency of the inference phase of convolutional neural networks (CNNs): model compression techniques and hardware accelerator implementation. To overcome the incompatibility between algorithm optimization and hardware design, this paper proposes HFOD, a hardware-friendly quantization method for object detection on embedded FPGAs. We adopt a channel-wise, uniform quantization method to compress the YOLOv3-Tiny model. Weights are quantized to 2 bits and activations to 8 bits for all convolutional layers. To achieve highly efficient implementations on FPGA, we add batch normalization (BN) layer fusion to the quantization process. A flexible, efficient convolutional unit structure is designed to exploit the hardware-friendly quantization, and the accelerator is developed from an automatic synthesis template. Experimental results show that the proposed accelerator design obtains more computing performance from the FPGA resources than regular 8-bit/16-bit fixed-point quantization. With 2-bit weights and 8-bit activations, the model size and the activation size of the proposed network are reduced by 16× and 4×, respectively, with a small accuracy loss. Our HFOD method achieves 90.6 GOPS on PYNQZ2 at 150 MHz, which is 1.4× faster and 2× better in power efficiency than a peer FPGA implementation on the same platform.
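The abstract names three ingredients: BN layer fusion, channel-wise uniform quantization, and the resulting size reduction. The following is a minimal Python sketch of how these commonly fit together, not the authors' implementation. The function names, the symmetric quantization scheme, the clamping range, and the 32-bit baseline used for the compression ratios are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def fold_bn(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold batch-norm parameters into one output channel's conv weights and bias.

    Standard BN folding: y = gamma * (conv(x) + bias - mean) / sqrt(var + eps) + beta.
    """
    factor = gamma / math.sqrt(var + eps)
    folded_weights = [w * factor for w in weights]
    folded_bias = beta + (bias - mean) * factor
    return folded_weights, folded_bias


def quantize_channel(weights, bits):
    """Symmetric uniform quantization of a single channel (assumed scheme).

    Returns the integer codes and the per-channel scale.
    """
    qmax = 2 ** (bits - 1) - 1          # 1 for 2-bit, 127 for 8-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def compression_ratio(baseline_bits, quant_bits):
    """Storage reduction relative to an assumed baseline bit width."""
    return baseline_bits / quant_bits


# Hypothetical values for one output channel.
folded_w, folded_b = fold_bn([0.75, -1.0, 0.2, 0.0], bias=0.1,
                             gamma=4.0, beta=0.5, mean=0.1, var=4.0, eps=0.0)
codes, scale = quantize_channel(folded_w, bits=2)
```

Under the assumption of a 32-bit floating-point baseline, `compression_ratio(32, 2)` and `compression_ratio(32, 8)` give 16 and 4, which matches the 16× model-size and 4× activation-size reductions stated in the abstract. Because each channel gets its own scale, folding BN before quantization lets the per-channel BN factor be absorbed into that scale rather than computed separately at inference time.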

Original language	English
Journal	IEICE Electronics Express
Volume	19
Issue	8
DOI	https://doi.org/10.1587/elex.19.20220067
Publication status	Published - 25 Apr 2022

Cite this

@article{539ab95dea2e4f38acd717a7406bbba6,

title = "HFOD: A hardware-friendly quantization method for object detection on embedded FPGAs",

abstract = "There are two research hotspots for improving performance and energy efficiency of the inference phase of Convolutional neural networks (CNNs). The first one is model compression techniques while the second is hardware accelerator implementation. To overcome the incompatibility of algorithm optimization and hardware design, this paper proposes HFOD, a hardware-friendly quantization method for object detection on embedded FPGAs. We adopt a channel-wise, uniform quantization method to compress YOLOv3-Tiny model. Weights are quantized to 2-bit while activations are quantized to 8-bit for all convolutional layers. To achieve highly-efficient implementations on FPGA, we add batch normalization (BN) layer fusion in quantization process. A flexible, efficient convolutional unit structure is designed to utilize hardware-friendly quantization, and the accelerator is developed based on an automatic synthesis template. Experimental results show that the resources of FPGA in the proposed accelerator design contribute more computing performance compared with regular 8-bit/16-bit fixed point quantization. The model size and the activation size of the proposed network with 2-bit weights and 8-bit activations can be effectively reduced by 16× and 4× with a small amount of accuracy loss, respectively. Our HFOD method can achieve 90.6 GOPS on PYNQZ2 at 150 MHz, which is 1.4× faster and 2× better in power efficiency than peer FPGA implementation on the same platform.",

keywords = "convolutional neural networks, highly-efficient implementation, quantization",

author = "Fei Zhang and Ziyang Gao and Jiaming Huang and Peining Zhen and Chen, {Hai Bao} and Jie Yan",

note = "Publisher Copyright: {\textcopyright} 2022 The Institute of Electronics, Information and Communication Engineers",

year = "2022",

month = apr,

day = "25",

doi = "10.1587/elex.19.20220067",

language = "English",

volume = "19",

journal = "IEICE Electronics Express",

issn = "1349-2543",

publisher = "The Institute of Electronics, Information and Communication Engineers (IEICE)",

number = "8",

}

TY - JOUR

T1 - HFOD

T2 - A hardware-friendly quantization method for object detection on embedded FPGAs

AU - Zhang, Fei

AU - Gao, Ziyang

AU - Huang, Jiaming

AU - Zhen, Peining

AU - Chen, Hai Bao

AU - Yan, Jie

PY - 2022/4/25

Y1 - 2022/4/25

N2 - There are two research hotspots for improving performance and energy efficiency of the inference phase of Convolutional neural networks (CNNs). The first one is model compression techniques while the second is hardware accelerator implementation. To overcome the incompatibility of algorithm optimization and hardware design, this paper proposes HFOD, a hardware-friendly quantization method for object detection on embedded FPGAs. We adopt a channel-wise, uniform quantization method to compress YOLOv3-Tiny model. Weights are quantized to 2-bit while activations are quantized to 8-bit for all convolutional layers. To achieve highly-efficient implementations on FPGA, we add batch normalization (BN) layer fusion in quantization process. A flexible, efficient convolutional unit structure is designed to utilize hardware-friendly quantization, and the accelerator is developed based on an automatic synthesis template. Experimental results show that the resources of FPGA in the proposed accelerator design contribute more computing performance compared with regular 8-bit/16-bit fixed point quantization. The model size and the activation size of the proposed network with 2-bit weights and 8-bit activations can be effectively reduced by 16× and 4× with a small amount of accuracy loss, respectively. Our HFOD method can achieve 90.6 GOPS on PYNQZ2 at 150 MHz, which is 1.4× faster and 2× better in power efficiency than peer FPGA implementation on the same platform.

AB - There are two research hotspots for improving performance and energy efficiency of the inference phase of Convolutional neural networks (CNNs). The first one is model compression techniques while the second is hardware accelerator implementation. To overcome the incompatibility of algorithm optimization and hardware design, this paper proposes HFOD, a hardware-friendly quantization method for object detection on embedded FPGAs. We adopt a channel-wise, uniform quantization method to compress YOLOv3-Tiny model. Weights are quantized to 2-bit while activations are quantized to 8-bit for all convolutional layers. To achieve highly-efficient implementations on FPGA, we add batch normalization (BN) layer fusion in quantization process. A flexible, efficient convolutional unit structure is designed to utilize hardware-friendly quantization, and the accelerator is developed based on an automatic synthesis template. Experimental results show that the resources of FPGA in the proposed accelerator design contribute more computing performance compared with regular 8-bit/16-bit fixed point quantization. The model size and the activation size of the proposed network with 2-bit weights and 8-bit activations can be effectively reduced by 16× and 4× with a small amount of accuracy loss, respectively. Our HFOD method can achieve 90.6 GOPS on PYNQZ2 at 150 MHz, which is 1.4× faster and 2× better in power efficiency than peer FPGA implementation on the same platform.

KW - convolutional neural networks

KW - highly-efficient implementation

KW - quantization

UR - http://www.scopus.com/inward/record.url?scp=85130715965&partnerID=8YFLogxK

U2 - 10.1587/elex.19.20220067

DO - 10.1587/elex.19.20220067

M3 - Article

AN - SCOPUS:85130715965

SN - 1349-2543

VL - 19

JO - IEICE Electronics Express

JF - IEICE Electronics Express

IS - 8

ER -
