TY - JOUR
T1 - On Improving Bounding Box Representations for Oriented Object Detection
AU - Yao, Yanqing
AU - Cheng, Gong
AU - Wang, Guangxing
AU - Li, Shengyang
AU - Zhou, Peicheng
AU - Xie, Xingxing
AU - Han, Junwei
N1 - Publisher Copyright: © 1980-2012 IEEE.
PY - 2023
Y1 - 2023
N2 - Detecting objects in remote sensing images (RSIs) using oriented bounding boxes (OBBs) is flourishing but challenging, wherein the design of OBB representations is the key to achieving accurate detection. In this article, we focus on two issues that hinder the performance of the two-stage oriented detectors: 1) the notorious boundary discontinuity problem, which would result in significant loss increases in boundary conditions, and 2) the inconsistency in regression schemes between the two stages. We propose a simple and effective bounding box representation by drawing inspiration from the polar coordinate system and integrate it into two detection stages to circumvent the two issues. The first stage specifically initializes four quadrant points as the starting points of the regression for producing high-quality oriented candidates without any postprocessing. In the second stage, the final localization results are refined using the proposed novel bounding box representation, which can fully release the capabilities of the oriented detectors. Such consistency brings a good trade-off between accuracy and speed. With only flipping augmentation and single-scale training and testing, our approach with ResNet-50-FPN harvests 76.25% mAP on the DOTA dataset with a speed of up to 16.5 frames/s, achieving the best accuracy and the fastest speed among the mainstream two-stage oriented detectors. Additional results on the DIOR-R and HRSC2016 datasets also demonstrate the effectiveness and robustness of our method. The source code is publicly available at https://github.com/yanqingyao1994/QPDet.
KW - Oriented object detection
KW - quadrant point regression
KW - rotated box refinement
KW - rotated proposal generation
UR - http://www.scopus.com/inward/record.url?scp=85146247920&partnerID=8YFLogxK
U2 - 10.1109/TGRS.2022.3231340
DO - 10.1109/TGRS.2022.3231340
M3 - Article
AN - SCOPUS:85146247920
SN - 0196-2892
VL - 61
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
M1 - 5600111
ER -