TY - JOUR
T1 - Incorporating Multiscale Context and Task-Consistent Focal Loss into Oriented Object Detection
AU - Qian, Xiaoliang
AU - Jian, Qingqing
AU - Wang, Wei
AU - Yao, Xiwen
AU - Cheng, Gong
N1 - Publisher Copyright:
© 1980-2012 IEEE.
PY - 2025
Y1 - 2025
AB - Oriented object detection (OOD) in remote sensing images (RSIs) aims to precisely localize and identify objects with arbitrary orientations. Two-stage OOD methods have attracted considerable interest due to their superior accuracy; however, they still face two major problems. First, misclassification occurs frequently because most classification strategies rely solely on the features of proposals. Second, most loss functions cannot simultaneously concentrate on hard samples and boost the consistency between identification and localization, which restricts further improvement of OOD models. To address the first problem, multiscale contextual information is incorporated into a two-stage OOD model in this article. Specifically, N contextual branches are added to predict the class confidence score (CCS) of each proposal and of its N enlarged proposals, which include multiscale context, and the final CCS of each proposal is the mean of these N + 1 CCSs. To tackle the second problem, a task-consistent focal (TF) loss is proposed. The TF loss uses the difficulty of localization as the weight of the classification loss and the difficulty of identification as the weight of the regression loss. Minimizing the TF loss both concentrates training on hard samples and optimizes classification and regression synchronously. The ablation studies show the validity of the contextual information, the TF loss, and their combination. The comparison with popular OOD models demonstrates the superior performance of our model on the DOTA and DIOR-R datasets.
KW - Multiscale context (MSC)
KW - oriented object detection (OOD)
KW - remote sensing image (RSI)
KW - task-consistent focal (TF) loss
UR - http://www.scopus.com/inward/record.url?scp=105008670310&partnerID=8YFLogxK
U2 - 10.1109/TGRS.2025.3580937
DO - 10.1109/TGRS.2025.3580937
M3 - Article
AN - SCOPUS:105008670310
SN - 0196-2892
VL - 63
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
M1 - 5628411
ER -