TY - JOUR
T1 - Hyperspectral Image Classification Using Group-Aware Hierarchical Transformer
AU - Mei, Shaohui
AU - Song, Chao
AU - Ma, Mingyang
AU - Xu, Fulin
N1 - Publisher Copyright: © 1980-2012 IEEE.
PY - 2022
Y1 - 2022
AB - Hyperspectral image (HSI) classification is a critical task with numerous applications in the field of remote sensing. Although convolutional neural networks have achieved remarkable success in computer vision, they are still limited in the ability to model long-term dependencies due to small receptive fields. Recently, vision transformers have been used in HSI classification, where multi-head self-attention (MHSA), as the key feature extractor of transformers, learns global dependencies in long-range positions and bands of HSI pixels. Existing vision transformers for classifying HSIs with a large number of bands, however, have some limitations in that features extracted by MHSA may exhibit over-dispersion. In this article, we propose a Group-Aware Hierarchical Transformer (GAHT) for HSI classification, which confines MHSA to the local spatial-spectral context by introducing a new grouped pixel embedding (GPE) module. The GPE emphasizes local relationships within HSI spectral channels, resulting in a global-local fashion from a spatial-spectral context for HSI classification. In addition, we construct our transformer in a hierarchical manner, which can significantly improve classification accuracy with only a few parameters. Extensive experiments on four benchmark HSI datasets demonstrate that the proposed method outperforms state-of-the-art HSI classification algorithms. The source code is available at https://github.com/MeiShaohui/Group-Aware-Hierarchical-Transformer.
KW - Grouped pixel embedding (GPE)
KW - hierarchical transformer
KW - hyperspectral image (HSI) classification
KW - multi-head self-attention (MHSA)
UR - http://www.scopus.com/inward/record.url?scp=85139414146&partnerID=8YFLogxK
U2 - 10.1109/TGRS.2022.3207933
DO - 10.1109/TGRS.2022.3207933
M3 - Article
AN - SCOPUS:85139414146
SN - 0196-2892
VL - 60
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
M1 - 5539014
ER -