TY - JOUR
T1 - DDFN: Deblurring Dictionary Encoding Fusion Network for Infrared and Visible Image Object Detection
AU - Lai, Jiawei
AU - Geng, Jie
AU - Deng, Xinyang
AU - Jiang, Wen
N1 - Publisher Copyright:
© 2004-2012 IEEE.
PY - 2023
Y1 - 2023
AB - Both infrared and visible images offer advantages for object detection: infrared images (IRs) capture the thermal characteristics of objects, while visible images provide high spatial resolution and clear texture details. Combining infrared and visible images for object detection is therefore attractive, but fully exploiting the inherent characteristics of the two modalities remains a challenging issue. To address this issue, a deblurring dictionary encoding fusion network (DDFN) is proposed for infrared and visible image object detection. First, a dual-stream feature extraction backbone is constructed to learn features suited to the characteristics of each modality. Then, pooling operations are applied to distill key information and reduce the complexity of the network. Afterward, a fuzzy compensation module (FCM) is proposed to minimize the information loss incurred by pooling. Finally, a dictionary encoding fusion module (DEFM) is proposed to robustly excavate potential interactions between infrared and visible images, obtaining fused features by aggregating the local information of infrared features and the long-range dependency information of visible features. The proposed DDFN exhibits excellent performance on two benchmark bimodal datasets and shows superior capability in infrared-visible image object detection.
KW - Dual-stream feature extraction
KW - feature fusion
KW - infrared image (IR)
KW - object detection
UR - http://www.scopus.com/inward/record.url?scp=85170567346&partnerID=8YFLogxK
U2 - 10.1109/LGRS.2023.3311176
DO - 10.1109/LGRS.2023.3311176
M3 - Article
AN - SCOPUS:85170567346
SN - 1545-598X
VL - 20
JO - IEEE Geoscience and Remote Sensing Letters
JF - IEEE Geoscience and Remote Sensing Letters
M1 - 6009705
ER -