Hyperspectral Image Classification Using Group-Aware Hierarchical Transformer

Shaohui Mei, Chao Song, Mingyang Ma, Fulin Xu

Research output: Contribution to journal › Article › peer-review

228 Scopus citations

Abstract

Hyperspectral image (HSI) classification is a critical task with numerous applications in the field of remote sensing. Although convolutional neural networks have achieved remarkable success in computer vision, their small receptive fields limit their ability to model long-range dependencies. Recently, vision transformers have been applied to HSI classification, where multi-head self-attention (MHSA), the key feature extractor of transformers, learns global dependencies across distant positions and bands of HSI pixels. However, existing vision transformers for classifying HSIs with a large number of bands are limited in that features extracted by MHSA may exhibit over-dispersion. In this article, we propose a Group-Aware Hierarchical Transformer (GAHT) for HSI classification, which confines MHSA to the local spatial-spectral context by introducing a new grouped pixel embedding (GPE) module. The GPE emphasizes local relationships within HSI spectral channels, so that the spatial-spectral context is modeled in a global-local fashion. In addition, we construct our transformer in a hierarchical manner, which can significantly improve classification accuracy with only a few parameters. Extensive experiments on four benchmark HSI datasets demonstrate that the proposed method outperforms state-of-the-art HSI classification algorithms. The source code is available at https://github.com/MeiShaohui/Group-Aware-Hierarchical-Transformer.
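
The idea of restricting each embedding feature to a local block of spectral bands can be illustrated with a small sketch. This is not the authors' implementation (the repository linked above is the reference for that); it is one plausible reading of a grouped embedding, written in plain Python. The function names `grouped_pixel_embedding` and `make_weights`, the group count, the embedding size, and the random weights standing in for learned parameters are all assumptions made for illustration.

```python
import random


def grouped_pixel_embedding(pixel, weights):
    """Project each contiguous band group of one pixel spectrum independently.

    pixel:   list of B band values for a single pixel.
    weights: list of G matrices; weights[g] has shape (D_g, B // G).
    Returns a list of G embedding vectors, one per band group.
    """
    n_groups = len(weights)
    if len(pixel) % n_groups != 0:
        raise ValueError("number of bands must be divisible by number of groups")
    group_size = len(pixel) // n_groups
    embeddings = []
    for g, w in enumerate(weights):
        bands = pixel[g * group_size:(g + 1) * group_size]
        # Each output feature mixes only the bands of its own group.
        embeddings.append([sum(wi * b for wi, b in zip(row, bands)) for row in w])
    return embeddings


def make_weights(n_groups, group_size, dim_per_group, seed=0):
    """Hypothetical stand-in for learned projection weights."""
    rng = random.Random(seed)
    return [[[rng.uniform(-1, 1) for _ in range(group_size)]
             for _ in range(dim_per_group)]
            for _ in range(n_groups)]


# Toy spectrum: 12 bands split into 4 groups of 3 bands each.
weights = make_weights(n_groups=4, group_size=3, dim_per_group=2)
pixel = [float(i) for i in range(12)]
base = grouped_pixel_embedding(pixel, weights)

# Perturb a single band in the first group and embed again.
perturbed_pixel = list(pixel)
perturbed_pixel[0] += 1.0
perturbed = grouped_pixel_embedding(perturbed_pixel, weights)

# Indices of the groups whose embedding changed after the perturbation.
changed = [g for g in range(4) if base[g] != perturbed[g]]
print(changed)
```

Because every group has its own projection, a change in one band affects only the embedding of the group that contains it. A single dense projection over all bands would instead let every band influence every feature, which is the kind of global mixing the abstract contrasts with the local spatial-spectral context.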

Original language: English
Article number: 5539014
Journal: IEEE Transactions on Geoscience and Remote Sensing
Volume: 60
State: Published - 2022

Keywords

  • Grouped pixel embedding (GPE)
  • hierarchical transformer
  • hyperspectral image (HSI) classification
  • multi-head self-attention (MHSA)
