MSSF-Net: A Multimodal Spectral–Spatial Feature Fusion Network for Hyperspectral Unmixing

Wei Gao, Yu Zhang, Youssef Akoudad, Jie Chen

Research output: Contribution to journal › Article › peer-review

Abstract

Hyperspectral unmixing (HU) aims to decompose mixed pixels in remote sensing imagery into material-specific spectra and their respective abundance fractions. Recently, autoencoders (AEs) have made significant advances in HU due to their strong representational capabilities and ease of implementation. However, relying exclusively on feature extraction from a single-modality hyperspectral image (HSI) can fail to fully utilize both spatial and spectral information, thereby limiting the ability to distinguish objects in complex scenes. To address these limitations, we propose a multimodal spectral-spatial feature fusion network (MSSF-Net) for enhanced HU. The MSSF-Net adopts a dual-stream architecture to extract feature representations from complementary input modalities. Specifically, the hyperspectral subnetwork leverages a convolutional neural network (CNN) to capture spatial information, while the light detection and ranging (LiDAR) subnetwork incorporates an enhanced channel attention mechanism (ECAM) to capture the dynamic changes in spatial information across different channels. Furthermore, we introduce a cross-modal fusion (CMF) module that integrates spectral and spatial information across modalities, leading to more robust feature representations. Experimental results indicate that the MSSF-Net significantly outperforms existing traditional and deep learning (DL)-based methods in terms of unmixing accuracy.
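
The abstract describes a dual-stream autoencoder: a CNN branch for the HSI, a LiDAR branch with channel attention, a cross-modal fusion step, and an abundance/reconstruction head. The sketch below illustrates that general structure in PyTorch; it is a minimal, illustrative assumption, not the authors' MSSF-Net implementation. All module names (ChannelAttention as a stand-in for the ECAM, CrossModalFusion-style concatenation for the CMF module), layer widths, and the softmax abundance constraint are assumptions chosen for clarity.

```python
# Minimal sketch of a dual-stream unmixing autoencoder in the spirit of MSSF-Net.
# All layer sizes and module designs are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation-style channel attention (stand-in for the ECAM)."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Conv2d(channels, max(channels // reduction, 1), 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(max(channels // reduction, 1), channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Reweight each channel by a learned, input-dependent score.
        return x * self.fc(self.pool(x))


class DualStreamUnmixer(nn.Module):
    def __init__(self, num_bands, num_lidar_channels, num_endmembers):
        super().__init__()
        # Hyperspectral branch: CNN capturing spatial-spectral features.
        self.hsi_branch = nn.Sequential(
            nn.Conv2d(num_bands, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(inplace=True),
        )
        # LiDAR branch: shallow CNN followed by channel attention.
        self.lidar_branch = nn.Sequential(
            nn.Conv2d(num_lidar_channels, 32, 3, padding=1), nn.ReLU(inplace=True),
            ChannelAttention(32),
        )
        # Cross-modal fusion: concatenation + 1x1 conv (placeholder for the CMF module).
        self.fusion = nn.Sequential(nn.Conv2d(64, 32, 1), nn.ReLU(inplace=True))
        # Abundance head; softmax enforces non-negativity and sum-to-one per pixel.
        self.abundance = nn.Sequential(nn.Conv2d(32, num_endmembers, 1), nn.Softmax(dim=1))
        # Linear decoder: its 1x1 weights act as the estimated endmember spectra.
        self.decoder = nn.Conv2d(num_endmembers, num_bands, 1, bias=False)

    def forward(self, hsi, lidar):
        fused = self.fusion(torch.cat([self.hsi_branch(hsi), self.lidar_branch(lidar)], dim=1))
        abund = self.abundance(fused)
        recon = self.decoder(abund)          # reconstruct the HSI from abundances
        return recon, abund


if __name__ == "__main__":
    model = DualStreamUnmixer(num_bands=162, num_lidar_channels=1, num_endmembers=4)
    hsi, lidar = torch.rand(2, 162, 64, 64), torch.rand(2, 1, 64, 64)
    recon, abund = model(hsi, lidar)
    print(recon.shape, abund.shape)  # torch.Size([2, 162, 64, 64]) torch.Size([2, 4, 64, 64])
```

Training such an autoencoder typically minimizes a reconstruction loss between `recon` and the input HSI; the specific losses, attention design, and fusion strategy used by MSSF-Net are detailed in the paper itself.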

Original language: English
Article number: 5511515
Journal: IEEE Transactions on Geoscience and Remote Sensing
Volume: 63
DOIs
State: Published - 2025

Keywords

  • Attention
  • autoencoder (AE)
  • deep learning (DL)
  • hyperspectral unmixing (HU)
  • multimodal remote sensing image (MRSI)
