TY - GEN
T1 - Multimodal Dual-domain Learning for Image Fusion
AU - Wang, Heng
AU - Jin, Mingxin
AU - Wang, Cong
AU - Yuan, Yuan
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
AB - Multimodal image fusion aims to generate high-resolution hyperspectral images by leveraging the complementary characteristics of spatially and spectrally high-resolution data. However, most existing approaches perform fusion solely in the spatial domain and neglect the potential of frequency-domain information. To address this limitation, this paper proposes a dual-domain learning network that integrates multimodal information from both the spatial and frequency domains. To exploit information in both domains, a core module, called the dual-domain fusion module, is designed specifically for image fusion. It consists of two branches: a spatial-domain branch and a frequency-domain branch. In the frequency-domain branch, the phase and amplitude information of the different modalities is explored to achieve multimodal information fusion in the frequency domain. Fusing dual-domain information helps the model mine richer contextual information and improves the detail reasoning ability of the multimodal fusion model. Experimental results on two public datasets show that the proposed network outperforms competing methods.
KW - Deep Learning
KW - Frequency domain analysis
KW - Multimodal image fusion
KW - Remote Sensing
KW - Super-resolution
UR - https://www.scopus.com/pages/publications/105035174640
U2 - 10.1109/ICCVW69036.2025.00678
DO - 10.1109/ICCVW69036.2025.00678
M3 - Conference contribution
AN - SCOPUS:105035174640
T3 - Proceedings - 2025 IEEE/CVF International Conference on Computer Vision Workshops, ICCV-W 2025
SP - 6545
EP - 6554
BT - Proceedings - 2025 IEEE/CVF International Conference on Computer Vision Workshops, ICCV-W 2025
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2025 IEEE/CVF International Conference on Computer Vision Workshops, ICCV-W 2025
Y2 - 19 October 2025 through 20 October 2025
ER -