Abstract
Advancements in multimodal remote sensing have enhanced image interpretation by enabling the integration of complementary information from heterogeneous sensors. However, in real-world scenarios, modality availability is often inconsistent due to sensor limitations and environmental factors, leading to incomplete data across regions. In this article, we propose a novel Multimodal Heterogeneous Hypergraph Learning (MHHL) approach for incomplete semantic segmentation of remote sensing images. The proposed MHHL framework constructs a heterogeneous hypergraph model to represent the complex relationships among modality combinations under incomplete conditions. It further establishes a graph neural network (GNN)-driven transductive learning framework that enables mutual learning and feature refinement across different modality combinations. The framework performs high-order relational modeling through a hypergraph and uses the representation aggregation capabilities of GNNs to dynamically update hypernode features, thereby constructing a comprehensive multimodal feature space and effectively aligning feature distributions across incomplete modalities. Additionally, the framework introduces global multimodal feature modeling and a multimodal hybrid feature discriminator to address the feature shift problem caused by varying modality combinations. Experimental results demonstrate that our MHHL model significantly outperforms relevant deep learning networks in accuracy on incomplete multimodal semantic segmentation.
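To make the core idea more concrete, the following is a minimal sketch (not the authors' implementation) of hypergraph-based feature refinement over modality combinations. It assumes, for illustration only, that each hypernode holds region features from one modality combination (e.g., optical only, SAR only, optical + SAR) and that each hyperedge groups all combinations sharing a given modality; the class and variable names (`HypergraphConv`, `H`, `features`) are hypothetical.

```python
# Minimal sketch of hypergraph convolution over modality-combination hypernodes
# (assumed setup, not the MHHL reference code).
import torch
import torch.nn as nn


class HypergraphConv(nn.Module):
    """One hypergraph convolution step: X' = Dv^-1/2 H W De^-1 H^T Dv^-1/2 X Theta."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.theta = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, x: torch.Tensor, H: torch.Tensor) -> torch.Tensor:
        # x: (num_hypernodes, in_dim); H: (num_hypernodes, num_hyperedges) incidence matrix
        w = torch.ones(H.size(1), device=x.device)       # uniform hyperedge weights (assumption)
        dv = (H * w).sum(dim=1).clamp(min=1.0)           # hypernode degrees
        de = H.sum(dim=0).clamp(min=1.0)                  # hyperedge degrees
        x = self.theta(x)
        # gather hypernode features into hyperedges, then scatter back to hypernodes
        edge_feat = (H / de).t() @ (x / dv.sqrt().unsqueeze(1))
        out = (H * w) @ edge_feat / dv.sqrt().unsqueeze(1)
        return torch.relu(out)


# Toy example with three modality combinations as hypernodes
# (0: optical only, 1: SAR only, 2: optical + SAR) and one hyperedge per shared
# modality (edge 0: "has optical" -> nodes {0, 2}; edge 1: "has SAR" -> nodes {1, 2}).
H = torch.tensor([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])
features = torch.randn(3, 64)                 # per-combination region features
layer = HypergraphConv(in_dim=64, out_dim=64)
refined = layer(features, H)                  # refined hypernode features, shape (3, 64)
```

In this sketch, message passing along the shared-modality hyperedges lets combinations with missing sensors borrow information from combinations where those sensors are present, which is one plausible reading of how the hypergraph aligns feature distributions across incomplete modalities.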
| Original language | English |
|---|---|
| Article number | 5639915 |
| Journal | IEEE Transactions on Geoscience and Remote Sensing |
| Volume | 63 |
| DOIs | |
| State | Published - 2025 |
Keywords
- Incomplete multimodal learning
- multimodal fusion
- remote sensing
- semantic segmentation