Abstract
Fusing infrared (IR) and visible (VIS) images yields a single image that combines the salient information of both sources with improved visual quality. Many fusion methods rely on complex networks to learn the finest parameters from the source images yet fail to fully exploit their reliable and salient information. To avoid information loss and incomplete fusion of feature information, we propose MTDFusion, a multilayer triple dense network (DN) for IR and VIS image fusion that fully exploits salient features and adaptively utilizes the residual features from the combination of the input images. The encoding network is designed as a triple network of dense blocks whose inputs are the IR image, the VIS image, and the difference image extracted from them. The fusion layer exploits a weighted combination of the inputs, and the decoding layers work conventionally. The results of our proposed approach show finer image detail and higher contrast. We improve the loss function by combining the smooth-L1 and structural similarity index (SSIM) losses, and we examine different values of the SSIM weight λ through qualitative and quantitative assessments. The method achieves fusion performance that compares favorably with several state-of-the-art (SOTA) methods. The source code for MTDFusion is available at https://github.com/tgg-77/MTDFusion.
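The abstract describes a training objective that combines a smooth-L1 pixel loss with an SSIM loss weighted by λ. A minimal NumPy sketch of such a combined loss is shown below; the default `beta` and `lam` values, the function names, and the simplified single-window SSIM (the standard SSIM uses local Gaussian windows) are all illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def smooth_l1(x, y, beta=1.0):
    # Smooth-L1 (Huber-style) loss: quadratic for small errors, linear for large ones.
    # beta is an assumed transition point, not taken from the paper.
    d = np.abs(x - y)
    return np.mean(np.where(d < beta, 0.5 * d * d / beta, d - 0.5 * beta))

def ssim_global(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    # Simplified global SSIM over the whole image (inputs scaled to [0, 1]).
    # The standard SSIM averages this statistic over local windows instead.
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def fusion_loss(fused, target, lam=1.0):
    # Combined objective: pixel-level smooth-L1 plus a lambda-weighted
    # SSIM dissimilarity term (1 - SSIM), as sketched from the abstract.
    return smooth_l1(fused, target) + lam * (1.0 - ssim_global(fused, target))
```

In practice `target` would itself be built from the IR and VIS inputs (e.g. a weighted combination), and λ would be swept over several values as the abstract describes.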
Original language | English |
---|---|
Article number | 5010117 |
Pages (from-to) | 1-17 |
Number of pages | 17 |
Journal | IEEE Transactions on Instrumentation and Measurement |
Volume | 73 |
DOIs | |
State | Published - 2024 |
Keywords
- Dense network (DN)
- difference image (DI)
- image fusion
- infrared (IR) image
- visible (VIS) image