TY - JOUR
T1 - Towards improving classification power for one-shot object detection
AU - Yang, Hanqing
AU - Lin, Yongliang
AU - Zhang, Hong
AU - Zhang, Yu
AU - Xu, Bin
N1 - Publisher Copyright:
© 2021 Elsevier B.V.
PY - 2021/9/30
Y1 - 2021/9/30
AB - Object detection based on deep learning typically relies on a large amount of training data, which can be labor-intensive to prepare. In this paper, we attempt to tackle this problem by addressing the One-Shot Object Detection (OSOD) task. Given a novel image, denoted as the query image, whose category label is not included in the training data, OSOD aims to detect objects of the same class in a complex scene, denoted as the target image. The performance of recent OSOD methods is much weaker than that of general object detection. We find that one of the reasons behind this limited performance is that more false positives (i.e., false detections) are generated. Therefore, we argue that reducing the number of false positives generated in the OSOD task is important for improving performance. To this end, we present a Focus On Classification One-Shot Object Detection (FOC OSOD) network. Specifically, we design the network from two perspectives: (1) how to obtain an effective similarity feature between the query image and the target image; and (2) how to classify the similarity feature effectively. To address these two challenges, we first propose a Classification Feature Deformation-and-Attention (CFDA) module to obtain a high-quality query feature and target feature, from which an effective similarity feature can be generated. Second, we present a Split Iterative Head (SIH) to improve the ability to classify the similarity feature. Extensive experiments on two public datasets (i.e., PASCAL VOC and COCO) demonstrate that the proposed framework achieves superior performance, outperforming other state-of-the-art methods by a considerable margin.
KW - Deep learning
KW - False positives
KW - One-shot object detection
KW - Siamese convolutional network
UR - http://www.scopus.com/inward/record.url?scp=85108724646&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2021.04.116
DO - 10.1016/j.neucom.2021.04.116
M3 - Article
AN - SCOPUS:85108724646
SN - 0925-2312
VL - 455
SP - 390
EP - 400
JO - Neurocomputing
JF - Neurocomputing
ER -