TY - JOUR
T1 - Multiscale and Cross-Level Attention Learning for Hyperspectral Image Classification
AU - Xu, Fulin
AU - Zhang, Ge
AU - Song, Chao
AU - Wang, Hui
AU - Mei, Shaohui
N1 - Publisher Copyright:
© 1980-2012 IEEE.
PY - 2023
Y1 - 2023
N2 - Transformer-based networks, which can effectively model the global characteristics of input data using the attention mechanism, have been widely applied to hyperspectral image (HSI) classification and achieved promising results. However, existing networks fail to explore complex local land-cover structures with different scales and shapes in hyperspectral remote sensing images. Therefore, a novel multiscale and cross-level attention learning (MCAL) network is proposed to fully explore both the global and local multiscale features of pixels for classification. To capture the local spatial context of pixels in the transformer, a multiscale feature extraction (MSFE) module is constructed and incorporated into the transformer-based network. Moreover, a cross-level feature fusion (CLFF) module is proposed to adaptively fuse features from the hierarchical structure of MSFEs using the attention mechanism. Finally, a spectral attention module (SAM) is applied prior to the hierarchical structure of MSFEs, by which both the spatial context and spectral information are jointly emphasized for hyperspectral classification. Experiments on several benchmark datasets demonstrate that the proposed MCAL clearly outperforms both convolutional neural network (CNN)-based and transformer-based state-of-the-art networks for hyperspectral classification.
AB - Transformer-based networks, which can effectively model the global characteristics of input data using the attention mechanism, have been widely applied to hyperspectral image (HSI) classification and achieved promising results. However, existing networks fail to explore complex local land-cover structures with different scales and shapes in hyperspectral remote sensing images. Therefore, a novel multiscale and cross-level attention learning (MCAL) network is proposed to fully explore both the global and local multiscale features of pixels for classification. To capture the local spatial context of pixels in the transformer, a multiscale feature extraction (MSFE) module is constructed and incorporated into the transformer-based network. Moreover, a cross-level feature fusion (CLFF) module is proposed to adaptively fuse features from the hierarchical structure of MSFEs using the attention mechanism. Finally, a spectral attention module (SAM) is applied prior to the hierarchical structure of MSFEs, by which both the spatial context and spectral information are jointly emphasized for hyperspectral classification. Experiments on several benchmark datasets demonstrate that the proposed MCAL clearly outperforms both convolutional neural network (CNN)-based and transformer-based state-of-the-art networks for hyperspectral classification.
KW - Hyperspectral image (HSI) classification
KW - multihead self-attention (MHSA)
KW - multiscale convolution (MSC)
KW - transformer
UR - http://www.scopus.com/inward/record.url?scp=85147202750&partnerID=8YFLogxK
U2 - 10.1109/TGRS.2023.3235819
DO - 10.1109/TGRS.2023.3235819
M3 - Article
AN - SCOPUS:85147202750
SN - 0196-2892
VL - 61
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
M1 - 5501615
ER -