TY - JOUR
T1 - Learning Orientation-Aware Distances for Oriented Object Detection
AU - Rao, Chaofan
AU - Wang, Jiabao
AU - Cheng, Gong
AU - Xie, Xingxing
AU - Han, Junwei
N1 - Publisher Copyright: © 1980-2012 IEEE.
PY - 2023
Y1 - 2023
N2 - Oriented object detectors have long suffered from the discontinuous boundary problem. In this work, we avoid this problem by relating regression outputs to regression target orientations. The core idea of our method is to build a contour function that takes an orientation as input and outputs the corresponding distance prediction. Inspired by Fourier transformations, we assume this function can be represented as a linear combination of trigonometric functions and Fourier series. We replace the final 4-D layer in the regression branch of the fully convolutional one-stage object detector (FCOS) with a Fourier series transformation (FST) module and term the new network FCOSF. With this design, the regression outputs of FCOSF vary adaptively with the regression target orientations, so the discontinuous boundary has no impact on FCOSF. More importantly, FCOSF avoids building complicated oriented box representations, which usually cause extra computation and ambiguity. With only flipping augmentation and single-scale training and testing, FCOSF with ResNet-50 achieves 73.64% mean average precision (mAP) on the DOTA-v1.0 dataset at up to 23.6 frames/s, surpassing all one-stage oriented object detectors. On the more challenging DOTA-v2.0 dataset, FCOSF also achieves the highest result among one-stage detectors, 51.75% mAP. Further experiments on DIOR-R and HRSC2016 verify the robustness of FCOSF. Code and models will be available at https://github.com/DDGRCF/FCOSF.
AB - Oriented object detectors have long suffered from the discontinuous boundary problem. In this work, we avoid this problem by relating regression outputs to regression target orientations. The core idea of our method is to build a contour function that takes an orientation as input and outputs the corresponding distance prediction. Inspired by Fourier transformations, we assume this function can be represented as a linear combination of trigonometric functions and Fourier series. We replace the final 4-D layer in the regression branch of the fully convolutional one-stage object detector (FCOS) with a Fourier series transformation (FST) module and term the new network FCOSF. With this design, the regression outputs of FCOSF vary adaptively with the regression target orientations, so the discontinuous boundary has no impact on FCOSF. More importantly, FCOSF avoids building complicated oriented box representations, which usually cause extra computation and ambiguity. With only flipping augmentation and single-scale training and testing, FCOSF with ResNet-50 achieves 73.64% mean average precision (mAP) on the DOTA-v1.0 dataset at up to 23.6 frames/s, surpassing all one-stage oriented object detectors. On the more challenging DOTA-v2.0 dataset, FCOSF also achieves the highest result among one-stage detectors, 51.75% mAP. Further experiments on DIOR-R and HRSC2016 verify the robustness of FCOSF. Code and models will be available at https://github.com/DDGRCF/FCOSF.
KW - Fourier series transformation (FST)
KW - orientation-aware distance
KW - oriented object detection
KW - remote sensing images
UR - http://www.scopus.com/inward/record.url?scp=85161084452&partnerID=8YFLogxK
U2 - 10.1109/TGRS.2023.3278933
DO - 10.1109/TGRS.2023.3278933
M3 - Article
AN - SCOPUS:85161084452
SN - 0196-2892
VL - 61
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
M1 - 5610911
ER -