Abstract
Fusing infrared (IR) and visible (VIS) images yields a single image that combines the salient information of both sources with improved visual quality. Many fusion methods rely on complex networks to learn the finest parameters from the source images yet fail to fully exploit their reliable and salient information. To avoid information loss and incomplete fusion of feature information, we propose MTDFusion, a multilayer triple dense network (DN) for IR and VIS image fusion that fully exploits salient features and adaptively utilizes the residual features from the combination of the input images. The encoding network is designed as a triple network of dense blocks whose inputs are the IR image, the VIS image, and the difference image extracted from them. The fusion layer exploits a weighted combination of the inputs, and the decoding layers work conventionally. The results of our proposed approach show finer image detail and higher contrast. We improve the loss function by combining the smooth-L1 and structural similarity index (SSIM) losses, and we examine different values of the SSIM weight λ through qualitative and quantitative assessments. The method achieves fusion performance that compares favorably with several state-of-the-art (SOTA) methods. The source code for MTDFusion is available at https://github.com/tgg-77/MTDFusion.
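The abstract describes a training objective that combines a smooth-L1 pixel loss with an SSIM loss weighted by λ. A minimal NumPy sketch of such a combined loss is shown below; the default `beta` and `lam` values, the function names, and the simplified single-window SSIM (the standard SSIM uses local Gaussian windows) are all illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def smooth_l1(x, y, beta=1.0):
    # Smooth-L1 (Huber-style) loss: quadratic for small errors, linear for large ones.
    # beta is an assumed transition point, not taken from the paper.
    d = np.abs(x - y)
    return np.mean(np.where(d < beta, 0.5 * d * d / beta, d - 0.5 * beta))

def ssim_global(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    # Simplified global SSIM over the whole image (inputs scaled to [0, 1]).
    # The standard SSIM averages this statistic over local windows instead.
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def fusion_loss(fused, target, lam=1.0):
    # Combined objective: pixel-level smooth-L1 plus a lambda-weighted
    # SSIM dissimilarity term (1 - SSIM), as sketched from the abstract.
    return smooth_l1(fused, target) + lam * (1.0 - ssim_global(fused, target))
```

In practice `target` would itself be built from the IR and VIS inputs (e.g. a weighted combination), and λ would be swept over several values as the abstract describes.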
Original language | English |
---|---|
Article number | 5010117 |
Pages (from-to) | 1-17 |
Number of pages | 17 |
Journal | IEEE Transactions on Instrumentation and Measurement |
Volume | 73 |
DOIs | |
State | Published - 2024 |
Keywords
- Dense network (DN)
- difference image (DI)
- image fusion
- infrared (IR) image
- visible (VIS) image