TY - GEN
T1 - Enhancing Multimodal Fusion with only Unimodal Data
AU - Han, Wenqi
AU - Geng, Jie
AU - Deng, Xinyang
AU - Jiang, Wen
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - With recent advances in remote sensing technology, a wealth of multimodal data is available for applications. However, considering the domain differences between multimodal data and the alignment challenges in practical applications, it becomes important and challenging to integrate these data effectively. In this paper, we propose a multimodal prototype representation fusion network (MPRFN) for SAR and optical image fusion segmentation. Specifically, a more robust multimodal feature representation is provided by constructing multimodal category prototype representations that better capture the characteristics and distribution of each data. Meanwhile, a prototype-consistent semi-supervised learning method is proposed to improve the effectiveness of multimodal fusion semantic segmentation using a large number of unlabelled unimodal SAR images. Experiments on SAR and optical multimodal datasets show that the proposed method achieves state-of-the-art performance.
AB - With recent advances in remote sensing technology, a wealth of multimodal data is available for applications. However, considering the domain differences between multimodal data and the alignment challenges in practical applications, it becomes important and challenging to integrate these data effectively. In this paper, we propose a multimodal prototype representation fusion network (MPRFN) for SAR and optical image fusion segmentation. Specifically, a more robust multimodal feature representation is provided by constructing multimodal category prototype representations that better capture the characteristics and distribution of each data. Meanwhile, a prototype-consistent semi-supervised learning method is proposed to improve the effectiveness of multimodal fusion semantic segmentation using a large number of unlabelled unimodal SAR images. Experiments on SAR and optical multimodal datasets show that the proposed method achieves state-of-the-art performance.
KW - multimodal fusion
KW - remote sensing
KW - semantic segmentation
KW - Semi-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85204922172&partnerID=8YFLogxK
U2 - 10.1109/IGARSS53475.2024.10641451
DO - 10.1109/IGARSS53475.2024.10641451
M3 - Conference contribution
AN - SCOPUS:85204922172
T3 - International Geoscience and Remote Sensing Symposium (IGARSS)
SP - 2962
EP - 2965
BT - IGARSS 2024 - 2024 IEEE International Geoscience and Remote Sensing Symposium, Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2024
Y2 - 7 July 2024 through 12 July 2024
ER -