TY - JOUR
T1 - Modal Feature Disentanglement and Contribution Estimation for Multimodality Image Fusion
AU - Zhang, Tao
AU - Yang, Xiaogang
AU - Lu, Ruitao
AU - Zhang, Dingwen
AU - Xie, Xueli
AU - Zhu, Zhengjie
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Multimodality image fusion (MMIF) aims to fuse complementary information from different modalities, e.g., salient objects and texture details, to improve image quality and information comprehensiveness. Most current MMIF methods adopt a "black-box" decoder to generate fused images, which leads to insufficient interpretability and difficulty in training. To address these problems, we recast MMIF as a modality contribution estimation task and propose a novel self-supervised fusion network based on modal feature disentanglement and contribution estimation, named MFDCE-Fuse. First, we construct a contrastive-learning autoencoder that seamlessly integrates the strengths of CNNs and the Swin Transformer to capture long-range global features and local texture details, and we design a contrastive reconstruction loss to promote the uniqueness and nonredundancy of the captured features. Second, considering that redundant modal features interfere with modal contribution estimation, we propose a feature-disentangled representation framework based on a contrastive constraint to obtain modal-common and modal-private features. The contribution of each modal image to MMIF is evaluated through the proportion of its modal-private features, which enhances both the interpretability of the fusion process and the quality of the fused image. Furthermore, an innovative weighted perceptual loss and a feature-disentanglement contrastive loss are constructed to guarantee that the private features remain intact. Qualitative and quantitative experimental results demonstrate the applicability and generalization of MFDCE-Fuse across multiple fusion tasks, including visible-infrared fusion (VIF) and medical image fusion (MIF).
AB - Multimodality image fusion (MMIF) aims to fuse complementary information from different modalities, e.g., salient objects and texture details, to improve image quality and information comprehensiveness. Most current MMIF methods adopt a "black-box" decoder to generate fused images, which leads to insufficient interpretability and difficulty in training. To address these problems, we recast MMIF as a modality contribution estimation task and propose a novel self-supervised fusion network based on modal feature disentanglement and contribution estimation, named MFDCE-Fuse. First, we construct a contrastive-learning autoencoder that seamlessly integrates the strengths of CNNs and the Swin Transformer to capture long-range global features and local texture details, and we design a contrastive reconstruction loss to promote the uniqueness and nonredundancy of the captured features. Second, considering that redundant modal features interfere with modal contribution estimation, we propose a feature-disentangled representation framework based on a contrastive constraint to obtain modal-common and modal-private features. The contribution of each modal image to MMIF is evaluated through the proportion of its modal-private features, which enhances both the interpretability of the fusion process and the quality of the fused image. Furthermore, an innovative weighted perceptual loss and a feature-disentanglement contrastive loss are constructed to guarantee that the private features remain intact. Qualitative and quantitative experimental results demonstrate the applicability and generalization of MFDCE-Fuse across multiple fusion tasks, including visible-infrared fusion (VIF) and medical image fusion (MIF).
KW - Contrastive learning
KW - contribution estimation
KW - disentangled representation
KW - feature disentanglement
KW - image fusion
UR - http://www.scopus.com/inward/record.url?scp=105001073997&partnerID=8YFLogxK
U2 - 10.1109/TIM.2025.3545534
DO - 10.1109/TIM.2025.3545534
M3 - Article
AN - SCOPUS:105001073997
SN - 0018-9456
VL - 74
JO - IEEE Transactions on Instrumentation and Measurement
JF - IEEE Transactions on Instrumentation and Measurement
M1 - 5012416
ER -
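
The abstract describes estimating each modality's fusion contribution from the proportion of its modal-private features. Below is a minimal, hypothetical sketch of that weighting idea, not the authors' implementation: all function and variable names are assumptions, and private-feature "proportion" is approximated here as relative L1 energy.

```python
# Hypothetical sketch of contribution estimation via the proportion of
# modal-private features (names and the L1-energy proxy are assumptions,
# not MFDCE-Fuse's actual code).
import torch

def contribution_weights(private_a: torch.Tensor,
                         private_b: torch.Tensor,
                         eps: float = 1e-8) -> tuple[torch.Tensor, torch.Tensor]:
    """Weight each modality by the energy of its private (modality-unique)
    feature map relative to the total private-feature energy, per sample."""
    e_a = private_a.abs().flatten(1).sum(dim=1)  # L1 energy of modality A
    e_b = private_b.abs().flatten(1).sum(dim=1)  # L1 energy of modality B
    total = e_a + e_b + eps                      # avoid division by zero
    return e_a / total, e_b / total

# Toy usage: two 64-channel private feature maps for a batch of 2 images.
fa, fb = torch.rand(2, 64, 32, 32), torch.rand(2, 64, 32, 32)
w_a, w_b = contribution_weights(fa, fb)
fused = w_a.view(-1, 1, 1, 1) * fa + w_b.view(-1, 1, 1, 1) * fb  # weighted mix
```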