TY - GEN
T1 - Infrared and Visible Image Fusion Model Based on Wavelet-Convolution and Transformer
AU - Chen, Yanwei
AU - Zheng, Lihan
AU - Wu, Yapeng
AU - Yang, Chen
AU - Guo, Haoyan
AU - Shi, Wentao
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
AB - Balancing local details and global structures remains a challenge in dual-spectrum image fusion, which integrates complementary information from infrared (IR) and visible (VI) sources. To address this, we propose WaveTransnet, a novel dual-branch network that combines wavelet transforms and Transformers. The network processes IR and VI inputs through parallel branches. Within each branch, wavelet convolutions and Transformer blocks concurrently extract high-frequency (HF) features that capture fine details and low-frequency (LF) features that represent structural context. An intramodal channel-spatial attention module then adaptively integrates the HF and LF features from the wavelet and Transformer paths within each modality (IR and VI separately), generating enhanced modality-specific HF and LF representations. Subsequently, cross-modal fusion merges the corresponding frequency components: the enhanced HF features from IR and VI are fused, and the enhanced LF features are fused separately. Finally, the fused HF and LF representations are reconstructed into the output image. Extensive experiments on the TNO dataset demonstrate that WaveTransnet achieves state-of-the-art performance, surpassing existing methods on multiple objective metrics (including EN, MI, SF, AG, SD, VIF) and in subjective visual quality. Notably, the model preserves detailed background textures from VI images while retaining salient thermal targets from IR images, highlighting its potential for practical applications that exploit frequency-specific information from both wavelet and Transformer perspectives.
KW - Dual-Branch Architecture
KW - Image Fusion
KW - Transformer
KW - Wavelet Convolution
UR - https://www.scopus.com/pages/publications/105018741796
U2 - 10.1109/ISAEECE66033.2025.11160080
DO - 10.1109/ISAEECE66033.2025.11160080
M3 - Conference contribution
AN - SCOPUS:105018741796
T3 - 2025 10th International Symposium on Advances in Electrical, Electronics and Computer Engineering, ISAEECE 2025
SP - 268
EP - 272
BT - 2025 10th International Symposium on Advances in Electrical, Electronics and Computer Engineering, ISAEECE 2025
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 10th International Symposium on Advances in Electrical, Electronics and Computer Engineering, ISAEECE 2025
Y2 - 20 June 2025 through 22 June 2025
ER -