TY - GEN
T1 - Enhancing Multimodal Fusion with only Unimodal Data
AU - Han, Wenqi
AU - Geng, Jie
AU - Deng, Xinyang
AU - Jiang, Wen
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - With recent advances in remote sensing technology, a wealth of multimodal data is available for applications. However, considering the domain differences between multimodal data and the alignment challenges in practical applications, it becomes important and challenging to integrate these data effectively. In this paper, we propose a multimodal prototype representation fusion network (MPRFN) for SAR and optical image fusion segmentation. Specifically, a more robust multimodal feature representation is provided by constructing multimodal category prototype representations that better capture the characteristics and distribution of each data. Meanwhile, a prototype-consistent semi-supervised learning method is proposed to improve the effectiveness of multimodal fusion semantic segmentation using a large number of unlabelled unimodal SAR images. Experiments on SAR and optical multimodal datasets show that the proposed method achieves state-of-the-art performance.
AB - With recent advances in remote sensing technology, a wealth of multimodal data is available for applications. However, considering the domain differences between multimodal data and the alignment challenges in practical applications, it becomes important and challenging to integrate these data effectively. In this paper, we propose a multimodal prototype representation fusion network (MPRFN) for SAR and optical image fusion segmentation. Specifically, a more robust multimodal feature representation is provided by constructing multimodal category prototype representations that better capture the characteristics and distribution of each data. Meanwhile, a prototype-consistent semi-supervised learning method is proposed to improve the effectiveness of multimodal fusion semantic segmentation using a large number of unlabelled unimodal SAR images. Experiments on SAR and optical multimodal datasets show that the proposed method achieves state-of-the-art performance.
KW - multimodal fusion
KW - remote sensing
KW - semantic segmentation
KW - Semi-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85204922172&partnerID=8YFLogxK
U2 - 10.1109/IGARSS53475.2024.10641451
DO - 10.1109/IGARSS53475.2024.10641451
M3 - Conference contribution
AN - SCOPUS:85204922172
T3 - International Geoscience and Remote Sensing Symposium (IGARSS)
SP - 2962
EP - 2965
BT - IGARSS 2024 - 2024 IEEE International Geoscience and Remote Sensing Symposium, Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2024
Y2 - 7 July 2024 through 12 July 2024
ER -