TY - JOUR
T1 - Boosting Object Detectors via Strong-Classification Weak-Localization Pretraining in Remote Sensing Imagery
AU - Zhang, Cong
AU - Liu, Tianshan
AU - Xiao, Jun
AU - Lam, Kin Man
AU - Wang, Qi
N1 - Publisher Copyright: © 1963-2012 IEEE.
PY - 2023
Y1 - 2023
N2 - Deep learning-based object detectors in remote sensing (RS) scenarios typically follow the paradigm of pretraining and fine-tuning to alleviate the limitation of insufficient downstream data. Despite the improved performance, existing pretraining paradigms are suboptimal due to three deficiencies: 1) inconsistent domains, i.e., pretraining on natural scenes and fine-tuning for RS scenes; 2) mismatched task objectives, i.e., classification-oriented pretraining while detection-oriented fine-tuning; and 3) misaligned architectures, i.e., pretraining only one bare backbone yet neglecting other vital detection components. Against these issues, this article proposes a novel pretraining paradigm specifically for the task of RS object detection, namely, RS strong-classification weak-localization (SCWL) pretraining. Unlike conventional classification pretraining, such as the widely used ImageNet pretraining, our pretraining strategy can adaptively perform bounding box generation on a reconstructed large-scale RS classification-style dataset. These pseudobounding boxes are integrated with the original accurate class labels as location- and category-related supervisions, respectively, to pretrain the entire RS detectors. The proposed RS SCWL pretraining paradigm is able to significantly improve downstream detection performance and outperforms classification pretraining methods, including ImageNet pretraining. Extensive experiments on different object detection datasets demonstrate its effectiveness and superiority in boosting various RS detectors.
KW - Object detection
KW - pretraining paradigms
KW - remote sensing (RS) imagery
KW - scene classification
KW - weakly supervised object localization (WSOL)
UR - http://www.scopus.com/inward/record.url?scp=85173002982&partnerID=8YFLogxK
U2 - 10.1109/TIM.2023.3315392
DO - 10.1109/TIM.2023.3315392
M3 - Article
AN - SCOPUS:85173002982
SN - 0018-9456
VL - 72
JO - IEEE Transactions on Instrumentation and Measurement
JF - IEEE Transactions on Instrumentation and Measurement
M1 - 5026520
ER -