TY - JOUR
T1 - DIMA
T2 - Digging into Multigranular Archetype for Fine-Grained Object Detection
AU - Cheng, Jiacheng
AU - Yao, Xiwen
AU - Yang, Xuguang
AU - Yuan, Xiang
AU - Feng, Xiaoxu
AU - Cheng, Gong
AU - Huang, Xiankai
AU - Han, Junwei
N1 - Publisher Copyright:
© 1980-2012 IEEE.
PY - 2024
Y1 - 2024
N2 - Fine-grained remote sensing object detection aims at precisely locating objects and determining their fine-level categories. This task is exceptionally challenging because substantial interclass similarity makes discriminative features difficult to capture. We attribute this to the absence of essential information that can serve as supervision for learning, namely comprehensive visual patterns of objects and intrinsic relationships among multigranular features. In this article, we propose a novel scheme dubbed digging into the multigranular archetype (DIMA) for fine-grained remote sensing object detection. In detail, we first design a simple yet effective frequency-aware representation supplement (FARS) mechanism that learns from original images and their auxiliary frequency counterparts simultaneously. FARS introduces high- and low-frequency representations to reinforce a range of visual cues, such as particular regions associated with the former and contours of objects related to the latter. We then devise a module named hierarchical classification paradigm (HCP), which constructs the interhierarchy relationships between coarse- and fine-level representations and exploits them to guide fine-grained feature enhancement. HCP ultimately selects and boosts samples that are hard to discriminate by maintaining consistency across levels. Our method can be easily integrated into prevailing oriented object detectors and brings consistent performance improvements across these detectors. Notably, our method combined with oriented RCNN (ORCNN) achieves 44.44% (+3.62%) on FAIR1M and 91.0% (+6.9%) on MAR20. Qualitative results and visualizations are also discussed to illustrate the advantages of our approach. The source code is available at https://github.com/chengjc2019/DIMA.
KW - FAIR1M dataset
KW - feature enhancement
KW - fine-grained object detection
KW - remote sensing images
UR - http://www.scopus.com/inward/record.url?scp=85196524887&partnerID=8YFLogxK
U2 - 10.1109/TGRS.2024.3415809
DO - 10.1109/TGRS.2024.3415809
M3 - Article
AN - SCOPUS:85196524887
SN - 0196-2892
VL - 62
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
M1 - 5628714
ER -