Prototypical Rebalancing Network With Semantic Alignment for Multimodal Remote Sensing Image Classification

Research output: Contribution to journal › Article › peer-review

Abstract

The effective utilization of multimodal heterogeneous data can significantly improve the accuracy of remote sensing land-cover classification. However, simply introducing extra network structures for fusion causes modality imbalance, where the dominant modality interferes with the learning rate and update direction of the others, limiting effective multimodal utilization and classification performance. To better leverage multimodal features, we propose a prototypical rebalancing network with semantic alignment (PRSANet) for multimodal remote sensing image classification. Specifically, to effectively fuse complementary multimodal information and constrain the optimization direction of each modality, a semantic alignment-based graph fusion module is proposed, which enhances the correlation between the fused features and land-cover categories. This module promotes the convergence of multimodal branches toward consistent semantic representations. Meanwhile, a prototypical rebalancing module is proposed, which constructs a nonparametric classifier based on category prototypes to calculate an imbalance factor that quantitatively evaluates the optimization degree of each modality. Then, based on this imbalance factor, an intermodal independent prototype loss is designed to enhance the performance of slow-learning modalities and guide their update direction. Experimental results on three heterogeneous datasets demonstrate that the proposed method performs strongly on multimodal land-cover classification tasks.
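The abstract does not give the formula for the imbalance factor, so the following is only an illustrative sketch of the general idea: a nonparametric classifier built from category prototypes, scored per modality, with the two scores compared. Everything specific here is an assumption rather than the PRSANet definition: the prototype as a class mean, the softmax over negative squared Euclidean distances, the ratio form of `imbalance_factor`, and the toy features for the hypothetical `modality_a` and `modality_b`.

```python
import math


def class_prototypes(features, labels):
    """Return one prototype per class: the mean feature vector of that class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def prototype_probs(x, prototypes):
    """Nonparametric classifier: softmax over negative squared distances."""
    classes = sorted(prototypes)
    logits = [-sum((a - b) ** 2 for a, b in zip(x, prototypes[c]))
              for c in classes]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return {c: e / total for c, e in zip(classes, exps)}


def modality_score(features, labels):
    """Mean probability the prototype classifier assigns to the true class."""
    protos = class_prototypes(features, labels)
    probs = [prototype_probs(x, protos)[y] for x, y in zip(features, labels)]
    return sum(probs) / len(probs)


def imbalance_factor(score_a, score_b):
    """Assumed ratio form: values above 1 mean modality A is ahead of B."""
    return score_a / score_b


# Toy data (hypothetical): modality A separates the two classes cleanly,
# modality B overlaps and so stands in for a slow-learning modality.
labels = [0, 0, 1, 1]
modality_a = [[0.0, 0.0], [0.2, 0.0], [3.0, 3.0], [3.2, 3.0]]
modality_b = [[0.0], [1.0], [0.5], [1.5]]

score_a = modality_score(modality_a, labels)
score_b = modality_score(modality_b, labels)
rho = imbalance_factor(score_a, score_b)
print(f"score_a={score_a:.3f} score_b={score_b:.3f} imbalance={rho:.3f}")
```

Under these assumptions, a factor above 1 flags `modality_b` as the lagging branch. A rebalancing loss in the spirit the abstract describes could then weight that branch's prototype loss more heavily, though the actual loss used in the paper is not specified here.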

Original language: English
Pages (from-to): 27582-27596
Number of pages: 15
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Volume: 18
State: Published - 2025

Keywords

  • Imbalance factor
  • modality rebalancing
  • multimodal fusion classification
  • semantic alignment
