TY - JOUR
T1 - Equivariant Multi-Modality Image Fusion
AU - Zhao, Zixiang
AU - Bai, Haowen
AU - Zhang, Jiangshe
AU - Zhang, Yulun
AU - Zhang, Kai
AU - Xu, Shuang
AU - Chen, Dongdong
AU - Timofte, Radu
AU - Van Gool, Luc
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Multi-modality image fusion is a technique that combines information from different sensors or modalities, enabling the fused image to retain complementary features from each modality, such as functional highlights and texture details. However, effective training of such fusion models is challenging due to the scarcity of ground truth fusion data. To tackle this issue, we propose the Equivariant Multi-Modality imAge fusion (EMMA) paradigm for end-to-end self-supervised learning. Our approach is rooted in the prior knowledge that natural imaging responses are equivariant to certain transformations. Consequently, we introduce a novel training paradigm that encompasses a fusion module, a pseudo-sensing module, and an equivariant fusion module. These components enable the net training to follow the principles of the natural sensing-imaging process while satisfying the equivariant imaging prior. Extensive experiments confirm that EMMA yields high-quality fusion results for infrared-visible and medical images, concurrently facilitating downstream multi-modal segmentation and detection tasks.
AB - Multi-modality image fusion is a technique that combines information from different sensors or modalities, enabling the fused image to retain complementary features from each modality, such as functional highlights and texture details. However, effective training of such fusion models is challenging due to the scarcity of ground truth fusion data. To tackle this issue, we propose the Equivariant Multi-Modality imAge fusion (EMMA) paradigm for end-to-end self-supervised learning. Our approach is rooted in the prior knowledge that natural imaging responses are equivariant to certain transformations. Consequently, we introduce a novel training paradigm that encompasses a fusion module, a pseudo-sensing module, and an equivariant fusion module. These components enable the net training to follow the principles of the natural sensing-imaging process while satisfying the equivariant imaging prior. Extensive experiments confirm that EMMA yields high-quality fusion results for infrared-visible and medical images, concurrently facilitating downstream multi-modal segmentation and detection tasks.
KW - image fusion
KW - low-level vision
UR - http://www.scopus.com/inward/record.url?scp=85218046661&partnerID=8YFLogxK
U2 - 10.1109/CVPR52733.2024.02448
DO - 10.1109/CVPR52733.2024.02448
M3 - Conference article
AN - SCOPUS:85218046661
SN - 1063-6919
SP - 25912
EP - 25921
JO - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
JF - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
T2 - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024
Y2 - 16 June 2024 through 22 June 2024
ER -