TY - JOUR
T1 - EDADet: Encoder-Decoder Domain Augmented Alignment Detector for Tiny Objects in Remote Sensing Images
T2 - IEEE Transactions on Geoscience and Remote Sensing
AU - Tao, Wenguang
AU - Wang, Xiaotian
AU - Yan, Tian
AU - Bi, Haixia
AU - Yan, Jie
N1 - Publisher Copyright:
© 1980-2012 IEEE.
PY - 2024
Y1 - 2024
AB - In recent years, deep learning has shown great potential in object detection, but it remains difficult to accurately detect tiny objects that occupy less than 1% of the image area in remote sensing images. Most existing studies focus on designing complex networks to learn discriminative features of tiny objects, which usually results in a heavy computational burden. In contrast, this paper proposes an accurate and efficient single-stage detector for tiny objects, called EDADet. First, domain conversion is employed to realize cross-domain multimodal data fusion from single-modal input. Then, a tiny-object-aware backbone is designed to extract features at different scales. Next, an encoder-decoder feature fusion structure is devised to achieve efficient cross-scale propagation of semantic information. Finally, a center-assist loss and an alignment self-supervised loss are adopted to alleviate the position sensitivity and drift issues of tiny objects. Experiments on the AI-TODv2 dataset demonstrate the effectiveness and practicality of EDADet: it achieves state-of-the-art performance, surpassing the second-best method by 9.65% in AP50 and 4.86% in mAP.
KW - cross-domain multi-modality
KW - encoder-decoder feature fusion
KW - loss function
KW - Remote sensing image
KW - tiny object detection
UR - http://www.scopus.com/inward/record.url?scp=85211218286&partnerID=8YFLogxK
DO - 10.1109/TGRS.2024.3510948
M3 - Article
AN - SCOPUS:85211218286
SN - 0196-2892
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
ER -