TY - JOUR
T1 - Hierarchical Mask Prompting and Robust Integrated Regression for Oriented Object Detection
AU - Yao, Yanqing
AU - Cheng, Gong
AU - Lang, Chunbo
AU - Yuan, Xiang
AU - Xie, Xingxing
AU - Han, Junwei
N1 - Publisher Copyright: © 1991-2012 IEEE.
PY - 2024
Y1 - 2024
AB - Object detection in remote sensing images has garnered significant attention due to its wide applications in real-world scenarios. However, most existing oriented object detectors still struggle with complex backgrounds and varying angles, which limits further performance gains. In this paper, we propose a novel oriented detector with Hierarchical mask prompting and Robust integrated regression, termed HRDet. Specifically, to cope with the first issue (complex backgrounds), we construct a hierarchical mask prompting module consisting of a semantic mask prediction branch and a hierarchical Softmax technique. The former aims to isolate object instances from cluttered interference under the guidance of coarse box-wise masks, while the latter propagates differentiated features for adjacent layers using hierarchical attentive weights. To deal with the second issue (varying angles), we pursue robust integrated regression and formulate an efficient oriented IoU loss that explicitly measures the discrepancies of three geometric factors in oriented regression, i.e., the central point distance, side length, and angle. This loss is intended to overcome the problem that existing IoU-based losses are invariant during the regression of varying angles. We apply these two strategies to a simple one-stage detection pipeline, achieving a new trade-off between speed and accuracy. Extensive experiments on four large aerial imagery datasets, DOTA-v1.0, DOTA-v2.0, DIOR-R, and HRSC2016, demonstrate that HRDet significantly improves the accuracy of the one-stage detector over refine-stage counterparts while maintaining its efficiency advantage. The source code will be available at https://github.com/yanqingyao1994/HRDet.
KW - Oriented object detector
KW - efficient oriented IoU loss
KW - hierarchical mask prompting
KW - remote sensing image
KW - robust integrated regression
KW - semantic mask
UR - http://www.scopus.com/inward/record.url?scp=85201450581&partnerID=8YFLogxK
U2 - 10.1109/TCSVT.2024.3444795
DO - 10.1109/TCSVT.2024.3444795
M3 - Article
AN - SCOPUS:85201450581
SN - 1051-8215
VL - 34
SP - 13071
EP - 13084
JO - IEEE Transactions on Circuits and Systems for Video Technology
JF - IEEE Transactions on Circuits and Systems for Video Technology
IS - 12
ER -