Remote Sensing Small Object Detection Based on Multi-Contextual Information Aggregation

Jingyu Wang; Mingrui Ma; Pengfei Hang; Shaohui Mei; Liang Zhang; Hongmei Wang

doi:10.1109/JSTARS.2025.3543189

School of Electronics and Information

Research output: Contribution to journal › Article › peer-review

Abstract

Because of wide fields of view and background confusion, objects in remote sensing images are small and densely packed, and commonly used detection methods perform unsatisfactorily on small objects. In this article, we propose multi-contextual information aggregation YOLO (MCIA-YOLO), which combines three novel modules to aggregate multi-contextual information across channels, depths, and pixels. First, the channel-spatial information aggregation (CSA) module assembles global spatial features according to channel contextual information, increasing the density of key information. Second, the shallow-deep information sparse aggregation (SDSA) module applies a sparse cross self-attention mechanism: by sparsely correlating long-range dependency information across different regions, it enhances the representation of small targets while removing redundant information. Third, to enrich local multi-scale features and better identify dense targets, the multi-scale weighted aggregation (MWA) module convolves multi-receptive-field information and performs weighted fusion. Our method performs well on the VisDrone2019, UAVDT, and NWPU VHR-10 datasets, especially in small object detection, surpassing several state-of-the-art methods.
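The idea of convolving multi-receptive-field information and fusing it with weights can be illustrated with a toy sketch. This is not the authors' MWA module: the abstract gives no implementation details, so the 1-D signal, the box filters standing in for convolutions, the kernel sizes, and the fixed softmax weights are all assumptions made for illustration. In a real detector the weights would presumably be learned.

```python
import math

def box_filter(signal, k):
    """Average over a window of odd width k with edge padding.
    Each k stands in for one receptive field (hypothetical choice)."""
    r = k // 2
    n = len(signal)
    out = []
    for i in range(n):
        # Clamp indices so out-of-range positions reuse the border value.
        window = [signal[min(max(j, 0), n - 1)] for j in range(i - r, i + r + 1)]
        out.append(sum(window) / k)
    return out

def softmax(logits):
    """Normalize raw scores into positive weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_multiscale_fusion(signal, kernel_sizes=(1, 3, 5), logits=(0.0, 0.0, 0.0)):
    """Filter the signal at several receptive fields, then take a weighted sum.
    The logits are fixed here (assumption); a trained model would learn them."""
    weights = softmax(list(logits))
    branches = [box_filter(signal, k) for k in kernel_sizes]
    return [sum(w * b[i] for w, b in zip(weights, branches))
            for i in range(len(signal))]

# A single "small object" (one spike) in an otherwise empty signal.
fused = weighted_multiscale_fusion([0.0, 0.0, 1.0, 0.0, 0.0])
```

With equal weights, the fused response still peaks at the spike while the wider windows spread some context to neighbouring positions. Raising the logit of a branch shifts the balance toward that receptive field.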

Original language	English
Journal	IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
DOIs	https://doi.org/10.1109/JSTARS.2025.3543189
State	Accepted/In press - 2025

Keywords

global contextual information
multi-contextual information
multi-receptive field enhancement
Small object detection
sparse cross self-attention

Access to Document

10.1109/JSTARS.2025.3543189

Cite this

@article{9595e3b967d14eb6b7dfc9876ab4a207,

title = "Remote Sensing Small Object Detection Based on Multi-Contextual Information Aggregation",

abstract = "Because of wide fields of view and background confusion, objects in remote sensing images are small and densely packed, and commonly used detection methods perform unsatisfactorily on small objects. In this article, we propose multi-contextual information aggregation YOLO (MCIA-YOLO), which combines three novel modules to aggregate multi-contextual information across channels, depths, and pixels. First, the channel-spatial information aggregation (CSA) module assembles global spatial features according to channel contextual information, increasing the density of key information. Second, the shallow-deep information sparse aggregation (SDSA) module applies a sparse cross self-attention mechanism: by sparsely correlating long-range dependency information across different regions, it enhances the representation of small targets while removing redundant information. Third, to enrich local multi-scale features and better identify dense targets, the multi-scale weighted aggregation (MWA) module convolves multi-receptive-field information and performs weighted fusion. Our method performs well on the VisDrone2019, UAVDT, and NWPU VHR-10 datasets, especially in small object detection, surpassing several state-of-the-art methods.",

keywords = "global contextual information, multi-contextual information, multi-receptive field enhancement, Small object detection, sparse cross self-attention",

author = "Jingyu Wang and Mingrui Ma and Pengfei Hang and Shaohui Mei and Liang Zhang and Hongmei Wang",

note = "Publisher Copyright: {\textcopyright} 2008-2012 IEEE.",

year = "2025",

doi = "10.1109/JSTARS.2025.3543189",

language = "English",

journal = "IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing",

issn = "1939-1404",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - Remote Sensing Small Object Detection Based on Multi-Contextual Information Aggregation

AU - Wang, Jingyu

AU - Ma, Mingrui

AU - Hang, Pengfei

AU - Mei, Shaohui

AU - Zhang, Liang

AU - Wang, Hongmei

PY - 2025

Y1 - 2025

AB - Because of wide fields of view and background confusion, objects in remote sensing images are small and densely packed, and commonly used detection methods perform unsatisfactorily on small objects. In this article, we propose multi-contextual information aggregation YOLO (MCIA-YOLO), which combines three novel modules to aggregate multi-contextual information across channels, depths, and pixels. First, the channel-spatial information aggregation (CSA) module assembles global spatial features according to channel contextual information, increasing the density of key information. Second, the shallow-deep information sparse aggregation (SDSA) module applies a sparse cross self-attention mechanism: by sparsely correlating long-range dependency information across different regions, it enhances the representation of small targets while removing redundant information. Third, to enrich local multi-scale features and better identify dense targets, the multi-scale weighted aggregation (MWA) module convolves multi-receptive-field information and performs weighted fusion. Our method performs well on the VisDrone2019, UAVDT, and NWPU VHR-10 datasets, especially in small object detection, surpassing several state-of-the-art methods.

KW - global contextual information

KW - multi-contextual information

KW - multi-receptive field enhancement

KW - Small object detection

KW - sparse cross self-attention

UR - http://www.scopus.com/inward/record.url?scp=85218806196&partnerID=8YFLogxK

U2 - 10.1109/JSTARS.2025.3543189

DO - 10.1109/JSTARS.2025.3543189

M3 - Article

AN - SCOPUS:85218806196

SN - 1939-1404

JO - IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing

JF - IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing

ER -
