TY - JOUR
T1 - Remote Sensing Small Object Detection Based on Multi-Contextual Information Aggregation
AU - Wang, Jingyu
AU - Ma, Mingrui
AU - Hang, Pengfei
AU - Mei, Shaohui
AU - Zhang, Liang
AU - Wang, Hongmei
N1 - Publisher Copyright: © 2008-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - Owing to the wide field of view and background confusion, objects in remote sensing images are small and densely packed, and commonly used detection methods perform unsatisfactorily on small objects. In this article, we propose multi-contextual information aggregation YOLO (MCIA-YOLO), a method that combines three novel modules to effectively aggregate multi-contextual information across channels, depths, and pixels. First, the channel-spatial information aggregation (CSA) module assembles global spatial features according to channel contextual information, increasing the density of key information. Second, the shallow-deep information sparse aggregation (SDSA) module applies a sparse cross self-attention mechanism. By sparsely correlating long-range dependency information across different regions, it enhances the representation capability of small targets while removing redundant information. Third, to enrich local multi-scale features and better identify dense targets, the multi-scale weighted aggregation (MWA) module convolves multi-receptive-field information and performs weighted fusion. Our method demonstrates satisfactory performance on the VisDrone2019, UAVDT, and NWPU VHR-10 datasets, especially in small object detection, surpassing several state-of-the-art methods.
AB - Owing to the wide field of view and background confusion, objects in remote sensing images are small and densely packed, and commonly used detection methods perform unsatisfactorily on small objects. In this article, we propose multi-contextual information aggregation YOLO (MCIA-YOLO), a method that combines three novel modules to effectively aggregate multi-contextual information across channels, depths, and pixels. First, the channel-spatial information aggregation (CSA) module assembles global spatial features according to channel contextual information, increasing the density of key information. Second, the shallow-deep information sparse aggregation (SDSA) module applies a sparse cross self-attention mechanism. By sparsely correlating long-range dependency information across different regions, it enhances the representation capability of small targets while removing redundant information. Third, to enrich local multi-scale features and better identify dense targets, the multi-scale weighted aggregation (MWA) module convolves multi-receptive-field information and performs weighted fusion. Our method demonstrates satisfactory performance on the VisDrone2019, UAVDT, and NWPU VHR-10 datasets, especially in small object detection, surpassing several state-of-the-art methods.
KW - global contextual information
KW - multi-contextual information
KW - multi-receptive field enhancement
KW - Small object detection
KW - sparse cross self-attention
UR - http://www.scopus.com/inward/record.url?scp=85218806196&partnerID=8YFLogxK
U2 - 10.1109/JSTARS.2025.3543189
DO - 10.1109/JSTARS.2025.3543189
M3 - Article
AN - SCOPUS:85218806196
SN - 1939-1404
JO - IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
JF - IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
ER -