TY - JOUR
T1 - Scale-Balanced Real-Time Object Detection With Varying Input-Image Resolution
AU - Yan, Longbin
AU - Qin, Yunxiao
AU - Chen, Jie
N1 - Publisher Copyright:
© 1991-2012 IEEE.
PY - 2023/1/1
Y1 - 2023/1/1
N2 - Current object-detection methods for small-scale objects are often marred by poor performance. Using relatively high-resolution input images can be considered a remedy for this issue, but it usually leads to performance degeneration for large-scale objects. We define this problem as the imbalance of detection performance for multi-scale objects when the resolution of input images varies. In addition, the use of high-resolution images results in significant computational resource consumption and inference-speed impairment. In this paper, we propose a friendly varying-resolution object-detection method for multi-scale objects. We analyze in detail the reasons leading to the performance degradation in the detection of large-scale objects with increasing input-image resolution, and propose a novel lightweight bidirectional feature-flow module to enhance the performance of multi-scale object detection in high-resolution images, especially for large-scale objects. The proposed approach can also ease the problems of computational resource consumption and inference-speed impairment caused by high-resolution images. Additionally, a decoupled detection head is designed to further improve performance by separating classification and regression sub-tasks, and an adaptive feature-fusion module is designed to better fuse different feature levels. The proposed scheme alleviates the negative effects of using high-resolution input images and achieves an excellent balance between inference speed and precision. Experiments on the MS COCO dataset show that the scheme achieves 44.6 AP at 42.6 FPS and 47 AP at 26.7 FPS, showing significant advantages over the methods to which it is compared.
AB - Current object-detection methods for small-scale objects are often marred by poor performance. Using relatively high-resolution input images can be considered a remedy for this issue, but it usually leads to performance degeneration for large-scale objects. We define this problem as the imbalance of detection performance for multi-scale objects when the resolution of input images varies. In addition, the use of high-resolution images results in significant computational resource consumption and inference-speed impairment. In this paper, we propose a friendly varying-resolution object-detection method for multi-scale objects. We analyze in detail the reasons leading to the performance degradation in the detection of large-scale objects with increasing input-image resolution, and propose a novel lightweight bidirectional feature-flow module to enhance the performance of multi-scale object detection in high-resolution images, especially for large-scale objects. The proposed approach can also ease the problems of computational resource consumption and inference-speed impairment caused by high-resolution images. Additionally, a decoupled detection head is designed to further improve performance by separating classification and regression sub-tasks, and an adaptive feature-fusion module is designed to better fuse different feature levels. The proposed scheme alleviates the negative effects of using high-resolution input images and achieves an excellent balance between inference speed and precision. Experiments on the MS COCO dataset show that the scheme achieves 44.6 AP at 42.6 FPS and 47 AP at 26.7 FPS, showing significant advantages over the methods to which it is compared.
KW - Deep convolution neural network (CNN)
KW - multi-scale features fusion
KW - object detection
UR - http://www.scopus.com/inward/record.url?scp=85136858723&partnerID=8YFLogxK
U2 - 10.1109/TCSVT.2022.3198329
DO - 10.1109/TCSVT.2022.3198329
M3 - 文章
AN - SCOPUS:85136858723
SN - 1051-8215
VL - 33
SP - 242
EP - 256
JO - IEEE Transactions on Circuits and Systems for Video Technology
JF - IEEE Transactions on Circuits and Systems for Video Technology
IS - 1
ER -