TY - JOUR
T1 - Towards improving classification power for one-shot object detection
AU - Yang, Hanqing
AU - Lin, Yongliang
AU - Zhang, Hong
AU - Zhang, Yu
AU - Xu, Bin
N1 - Publisher Copyright:
© 2021 Elsevier B.V.
PY - 2021/9/30
Y1 - 2021/9/30
AB - Object detection based on deep learning typically relies on a large amount of training data, which can be labor-intensive to prepare. In this paper, we attempt to tackle this problem by addressing the One-Shot Object Detection (OSOD) task. Given a novel image, denoted as the query image, whose category label is not included in the training data, OSOD aims to detect objects of the same class in a complex scene, denoted as the target image. The performance of recent OSOD methods is much weaker than that of general object detection. We find that one of the reasons behind this limited performance is that more false positives (i.e., false detections) are generated. Therefore, we argue that reducing the number of false positives generated in the OSOD task is important for improving performance. To this end, we present a Focus On Classification One-Shot Object Detection (FOC OSOD) network. Specifically, we design the network from two perspectives: (1) how to obtain an effective similarity feature between the query image and the target image; and (2) how to classify the similarity feature effectively. To address these two challenges, we first propose a Classification Feature Deformation-and-Attention (CFDA) module to obtain a high-quality query feature and target feature, from which an effective similarity feature can be generated. Second, we present a Split Iterative Head (SIH) to improve the ability to classify the similarity feature. Extensive experiments on two public datasets (i.e., PASCAL VOC and COCO) demonstrate that the proposed framework achieves superior performance, outperforming other state-of-the-art methods by a considerable margin.
KW - Deep learning
KW - False positives
KW - One-shot object detection
KW - Siamese convolutional network
UR - http://www.scopus.com/inward/record.url?scp=85108724646&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2021.04.116
DO - 10.1016/j.neucom.2021.04.116
M3 - Article
AN - SCOPUS:85108724646
SN - 0925-2312
VL - 455
SP - 390
EP - 400
JO - Neurocomputing
JF - Neurocomputing
ER -