Cross-Scale Feature Fusion for Object Detection in Optical Remote Sensing Images

Gong Cheng; Yongjie Si; Hailong Hong; Xiwen Yao; Lei Guo

doi:10.1109/LGRS.2020.2975541

School of Automation

Northwestern Polytechnical University Xian

Research output: Contribution to journal › Article › peer-review

209 Scopus citations

Abstract

For the time being, there are many groundbreaking object detection frameworks used in natural scene images. These algorithms have good detection performance on the data sets of open natural scenes. However, applying these frameworks to remote sensing images directly is not very effective. The existing deep-learning-based object detection algorithms still face some challenges when dealing with remote sensing images because these images usually contain a number of targets with large variations of object sizes as well as interclass similarity. Aiming at the challenges of object detection in optical remote sensing images, we propose an end-to-end cross-scale feature fusion (CSFF) framework, which can effectively improve the object detection accuracy. Specifically, we first use a feature pyramid network (FPN) to obtain multilevel feature maps and then insert a squeeze and excitation (SE) block into the top layer to model the relationship between different feature channels. Next, we use the CSFF module to obtain powerful and discriminative multilevel feature representations. Finally, we implement our work in the framework of Faster region-based CNN (R-CNN). In the experiment, we evaluate our method on a publicly available large-scale data set, named DIOR, and obtain an improvement of 3.0% measured in terms of mAP compared with Faster R-CNN with FPN.
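The squeeze and excitation (SE) step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the standard SE formulation (global average pooling, two fully connected layers with ReLU and sigmoid, then channel-wise rescaling), and the function name `se_block` and the weight matrices `w1` and `w2` are hypothetical, chosen only for illustration.

```python
import math


def se_block(feature_map, w1, w2):
    """Apply a squeeze-and-excitation step to one feature map.

    feature_map: list of C channels, each an H x W list of lists of floats.
    w1: (C/r) x C weight matrix of the first fully connected layer (assumed).
    w2: C x (C/r) weight matrix of the second fully connected layer (assumed).
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, part 1: bottleneck layer followed by ReLU.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    # Excitation, part 2: expansion layer followed by sigmoid, one gate per channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for gate, channel in zip(gates, feature_map)
    ]
    return scaled, gates


# Toy input: two 2 x 2 channels and hand-picked weights (illustrative only).
toy_map = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[3.0, 3.0], [3.0, 3.0]],
]
toy_w1 = [[1.0, 1.0]]    # reduces 2 channels to 1 hidden unit
toy_w2 = [[0.0], [1.0]]  # expands back to 2 gates
toy_scaled, toy_gates = se_block(toy_map, toy_w1, toy_w2)
```

In the toy example the first gate is exactly 0.5, because its weight is zero, so the first channel is halved, while the second channel is kept almost unchanged. In a trained network the weights are learned, and according to the abstract the block is applied to the top FPN layer.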

Original language	English
Article number	9024005
Pages (from-to)	431-435
Number of pages	5
Journal	IEEE Geoscience and Remote Sensing Letters
Volume	18
Issue number	3
DOIs	https://doi.org/10.1109/LGRS.2020.2975541
State	Published - Mar 2021

Keywords

Convolutional neural networks (CNNs)
cross-scale feature fusion (CSFF)
object detection
remote sensing images

Cite this

@article{f58085e0bab54903bab984dbd6b1d915,

title = "Cross-Scale Feature Fusion for Object Detection in Optical Remote Sensing Images",

abstract = "For the time being, there are many groundbreaking object detection frameworks used in natural scene images. These algorithms have good detection performance on the data sets of open natural scenes. However, applying these frameworks to remote sensing images directly is not very effective. The existing deep-learning-based object detection algorithms still face some challenges when dealing with remote sensing images because these images usually contain a number of targets with large variations of object sizes as well as interclass similarity. Aiming at the challenges of object detection in optical remote sensing images, we propose an end-to-end cross-scale feature fusion (CSFF) framework, which can effectively improve the object detection accuracy. Specifically, we first use a feature pyramid network (FPN) to obtain multilevel feature maps and then insert a squeeze and excitation (SE) block into the top layer to model the relationship between different feature channels. Next, we use the CSFF module to obtain powerful and discriminative multilevel feature representations. Finally, we implement our work in the framework of Faster region-based CNN (R-CNN). In the experiment, we evaluate our method on a publicly available large-scale data set, named DIOR, and obtain an improvement of 3.0% measured in terms of mAP compared with Faster R-CNN with FPN.",

keywords = "Convolutional neural networks (CNNs), cross-scale feature fusion (CSFF), object detection, remote sensing images",

author = "Gong Cheng and Yongjie Si and Hailong Hong and Xiwen Yao and Lei Guo",

note = "Publisher Copyright: {\textcopyright} 2004-2012 IEEE.",

year = "2021",

month = mar,

doi = "10.1109/LGRS.2020.2975541",

language = "English",

volume = "18",

pages = "431--435",

journal = "IEEE Geoscience and Remote Sensing Letters",

issn = "1545-598X",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "3",

}

TY - JOUR

T1 - Cross-Scale Feature Fusion for Object Detection in Optical Remote Sensing Images

AU - Cheng, Gong

AU - Si, Yongjie

AU - Hong, Hailong

AU - Yao, Xiwen

AU - Guo, Lei

PY - 2021/3

Y1 - 2021/3

AB - For the time being, there are many groundbreaking object detection frameworks used in natural scene images. These algorithms have good detection performance on the data sets of open natural scenes. However, applying these frameworks to remote sensing images directly is not very effective. The existing deep-learning-based object detection algorithms still face some challenges when dealing with remote sensing images because these images usually contain a number of targets with large variations of object sizes as well as interclass similarity. Aiming at the challenges of object detection in optical remote sensing images, we propose an end-to-end cross-scale feature fusion (CSFF) framework, which can effectively improve the object detection accuracy. Specifically, we first use a feature pyramid network (FPN) to obtain multilevel feature maps and then insert a squeeze and excitation (SE) block into the top layer to model the relationship between different feature channels. Next, we use the CSFF module to obtain powerful and discriminative multilevel feature representations. Finally, we implement our work in the framework of Faster region-based CNN (R-CNN). In the experiment, we evaluate our method on a publicly available large-scale data set, named DIOR, and obtain an improvement of 3.0% measured in terms of mAP compared with Faster R-CNN with FPN.

KW - Convolutional neural networks (CNNs)

KW - cross-scale feature fusion (CSFF)

KW - object detection

KW - remote sensing images

UR - http://www.scopus.com/inward/record.url?scp=85101858852&partnerID=8YFLogxK

U2 - 10.1109/LGRS.2020.2975541

DO - 10.1109/LGRS.2020.2975541

M3 - Article

AN - SCOPUS:85101858852

SN - 1545-598X

VL - 18

SP - 431

EP - 435

JO - IEEE Geoscience and Remote Sensing Letters

JF - IEEE Geoscience and Remote Sensing Letters

IS - 3

M1 - 9024005

ER -
