TY - JOUR
T1 - A light-weight, efficient, and general cross-modal image fusion network
AU - Fang, Aiqing
AU - Zhao, Xinbo
AU - Yang, Jiaqi
AU - Qin, Beibei
AU - Zhang, Yanning
N1 - Publisher Copyright:
© 2021 Elsevier B.V.
PY - 2021/11/6
Y1 - 2021/11/6
N2 - Existing cross-modal image fusion methods pay limited attention to image fusion efficiency and network architecture, yet the efficiency and accuracy of image fusion have an important impact on practical applications. To address this problem, we propose a light-weight, efficient, and general cross-modal image fusion network, termed AE-Netv2. Firstly, we analyze the influence of different network architectures (e.g., group convolution, depth-wise convolution, InceptionNet, SqueezeNet, ShuffleNet, and multi-scale modules) on image fusion quality and efficiency, providing a reference for the design of image fusion architectures. Secondly, we explore the commonalities and distinctive characteristics of different image fusion tasks, providing a basis for further research on the continual learning characteristics of the human brain. Finally, a positive-sample loss is added to the similarity loss to reduce the difference in data distribution across different cross-modal image fusion tasks. Comprehensive experiments demonstrate the superiority of our method over state-of-the-art methods on different fusion tasks, running in real time at 100+ FPS on a GTX 2070. Compared with the fastest deep-learning-based image fusion method, AE-Netv2 is 2.14 times more efficient. Compared with the smallest existing image fusion model, our model is 11.59 times smaller.
AB - Existing cross-modal image fusion methods pay limited attention to image fusion efficiency and network architecture, yet the efficiency and accuracy of image fusion have an important impact on practical applications. To address this problem, we propose a light-weight, efficient, and general cross-modal image fusion network, termed AE-Netv2. Firstly, we analyze the influence of different network architectures (e.g., group convolution, depth-wise convolution, InceptionNet, SqueezeNet, ShuffleNet, and multi-scale modules) on image fusion quality and efficiency, providing a reference for the design of image fusion architectures. Secondly, we explore the commonalities and distinctive characteristics of different image fusion tasks, providing a basis for further research on the continual learning characteristics of the human brain. Finally, a positive-sample loss is added to the similarity loss to reduce the difference in data distribution across different cross-modal image fusion tasks. Comprehensive experiments demonstrate the superiority of our method over state-of-the-art methods on different fusion tasks, running in real time at 100+ FPS on a GTX 2070. Compared with the fastest deep-learning-based image fusion method, AE-Netv2 is 2.14 times more efficient. Compared with the smallest existing image fusion model, our model is 11.59 times smaller.
KW - Collaborative optimization
KW - Deep learning
KW - Image fusion
KW - Image quality
KW - Optimization
UR - http://www.scopus.com/inward/record.url?scp=85113325587&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2021.08.044
DO - 10.1016/j.neucom.2021.08.044
M3 - Article
AN - SCOPUS:85113325587
SN - 0925-2312
VL - 463
SP - 198
EP - 211
JO - Neurocomputing
JF - Neurocomputing
ER -