TY - JOUR
T1 - A light-weight, efficient, and general cross-modal image fusion network
AU - Fang, Aiqing
AU - Zhao, Xinbo
AU - Yang, Jiaqi
AU - Qin, Beibei
AU - Zhang, Yanning
N1 - Publisher Copyright:
© 2021 Elsevier B.V.
PY - 2021/11/6
Y1 - 2021/11/6
N2 - Existing cross-modal image fusion methods pay limited attention to image fusion efficiency and network architecture, yet the efficiency and accuracy of image fusion have an important impact on practical applications. To address this problem, we propose a light-weight, efficient, and general cross-modal image fusion network, termed AE-Netv2. Firstly, we analyze the influence of different network architectures (e.g., group convolution, depth-wise convolution, InceptionNet, SqueezeNet, ShuffleNet, and multi-scale modules) on image fusion quality and efficiency, providing a reference for the design of image fusion architectures. Secondly, we explore the commonalities and distinctive characteristics of different image fusion tasks, providing a basis for further research on the continual learning characteristics of the human brain. Finally, a positive-sample loss is added to the similarity loss to reduce the difference in data distribution across different cross-modal image fusion tasks. Comprehensive experiments demonstrate the superiority of our method over state-of-the-art methods on different fusion tasks, running in real time at 100+ FPS on a GTX 2070. Compared with the fastest deep-learning-based image fusion method, AE-Netv2 is 2.14 times more efficient. Compared with the smallest existing image fusion model, our model is 11.59 times smaller.
AB - Existing cross-modal image fusion methods pay limited attention to image fusion efficiency and network architecture, yet the efficiency and accuracy of image fusion have an important impact on practical applications. To address this problem, we propose a light-weight, efficient, and general cross-modal image fusion network, termed AE-Netv2. Firstly, we analyze the influence of different network architectures (e.g., group convolution, depth-wise convolution, InceptionNet, SqueezeNet, ShuffleNet, and multi-scale modules) on image fusion quality and efficiency, providing a reference for the design of image fusion architectures. Secondly, we explore the commonalities and distinctive characteristics of different image fusion tasks, providing a basis for further research on the continual learning characteristics of the human brain. Finally, a positive-sample loss is added to the similarity loss to reduce the difference in data distribution across different cross-modal image fusion tasks. Comprehensive experiments demonstrate the superiority of our method over state-of-the-art methods on different fusion tasks, running in real time at 100+ FPS on a GTX 2070. Compared with the fastest deep-learning-based image fusion method, AE-Netv2 is 2.14 times more efficient. Compared with the smallest existing image fusion model, our model is 11.59 times smaller.
KW - Collaborative optimization
KW - Deep learning
KW - Image fusion
KW - Image quality
KW - Optimization
UR - http://www.scopus.com/inward/record.url?scp=85113325587&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2021.08.044
DO - 10.1016/j.neucom.2021.08.044
M3 - Article
AN - SCOPUS:85113325587
SN - 0925-2312
VL - 463
SP - 198
EP - 211
JO - Neurocomputing
JF - Neurocomputing
ER -